Abstract

The ability to find a low execution-cost plan efficiently over a wide domain of
applicability is the core of domain-independent planning systems. The approach
investigated  here  to  building  such  a  planning  system  begins  with  two  hypotheses: 
(1)  no  single  method  will  satisfy  both  sufficiency  and  efficiency  for  all  situations; 
and  (2)  multi-method  planning  can  outperform  single-method  planning  in  terms 
of  sufficiency  and  efficiency.  To  evaluate  these  hypotheses,  a  set  of  single-method 
planners  has  been  constructed.  The  results  obtained  from  the  experiments  with 
these  planners  for  the  domains  investigated  show  that  these  planners  have  trouble 
performing  efficiently  over  a  wide  range  of  problems. 

As an alternative to single-method planning, multi-method planning is investigated
in this thesis. A multi-method planner consists of a coordinated set of methods
which have different performance and scope. Given a set of created methods,
the key issue in multi-method planning is how to coordinate individual methods in
an efficient manner so that the multi-method planner can have high performance.
The multi-method planning framework presented here provides one way to do this
based on the notion of bias-relaxation. In a bias-relaxation multi-method planner,
planning starts by trying highly restricted and efficient methods, and then
successively relaxes restrictions until a sufficient method is found.

A class of bias-relaxation multi-method planners has been developed. These
planners vary in the granularity at which individual methods are selected and
used. Depending on the granularity of method switching, two variations on strongly
monotonic multi-method planners are implemented: coarse-grained multi-method
planners, where methods are switched on a problem-by-problem basis; and
fine-grained multi-method planners, where methods are switched on a goal-by-goal
basis.

The experimental results indicate that, at least for the domains investigated,
both coarse-grained and fine-grained multi-method planning can reduce plan length
significantly compared with single-method planning, and fine-grained planning can
improve the planning time significantly compared with coarse-grained and
single-method planning. Application to a simulated agent domain also shows one way
that multi-method planning can be used in more complex domains.
Chapter 1


Introduction 


Research  in  domain-independent  planning  systems  has  been  a  main  theme  in  the 
area  of  AI  planning.  These  systems  vary  according  to  the  way  in  which  the  search 
space  is  defined  and  traversed,  the  way  in  which  plans  are  represented,  the  way  in 
which goal interactions are dealt with, the way in which time and resources are
handled,  the  way  in  which  planning  interacts  with  execution,  and  so  on  [Allen  et 
al.,  1990].  Among  the  criteria  used  to  evaluate  these  systems,  three  typical  ones 
are  the  amount  of  time  required  to  find  the  plan;  the  execution  cost  of  the  plan 
itself;  and  the  ability  to  find  some  plan,  or  an  optimal  plan,  for  any  problem  in 
an  arbitrary  domain.  Thus,  finding  a  low  execution-cost  plan  efficiently  over  a 
wide domain of applicability is the core of domain-independent planning systems.
The  key  issue  here  in  building  such  a  system  is  how  to  construct  a  single  planning 
method,  or  a  coordinated  set  of  different  planning  methods. 

The  hypotheses  underlying  this  research  are  (1)  no  single  method  will  satisfy 
both  sufficiency  and  efficiency  for  all  situations;  and  (2)  multi-method  planning 
can  outperform  single-method  planning  in  terms  of  sufficiency  and  efficiency.  The 
first hypothesis is based on the observation that most conventional planning
systems which encode planning behaviors within a single fixed method — such as
linear  planning,  nonlinear  planning,  abstraction,  and  so  on  —  have  a  limitation  in 
performing  efficiently  over  a  wide  range  of  problems. 

For example, STRIPS-type planners can generate plans quite efficiently for some
problems by using the linearity assumption [Fikes and Nilsson, 1971]. With this
assumption, the number of goal conjuncts considered at each planning step can be
reduced, so that planning time can be saved. However, this assumption makes the
planners unable to generate an optimal plan in certain domains, and causes them
to fail to find a plan in domains with irreversible operators. Sussman’s anomaly
in the blocks-world domain is a classical problem where an optimal solution cannot
be found by a linear planner [Sussman, 1973]. Figure 1.1 shows the initial state,
goal conjuncts, and operators for this problem.¹ Since a linear planner does not
consider the other goal conjuncts until the current goal conjunct is achieved, both
goal orderings, (on A B) followed by (on B C), or (on B C) followed by (on A B),
generate non-optimal operator sequences.

Figure 1.1: Sussman’s anomaly in the blocks-world domain.

¹ Throughout this thesis, variables in operators are denoted by angle brackets, as in <a>.
Figure 1.2: A problem in the one-way-rocket domain.


A more serious problem occurs in domains with irreversible operators. Figure 1.2
shows a problem in the one-way-rocket domain which cannot be solved by
a linear planner [Veloso, 1989]. In this problem, achieving either goal conjunct
individually inhibits achieving the other goal conjunct. For example, after achieving
the first goal conjunct (at 01 LocB) by applying (LOAD 01) → (MOVE-ROCKET)
→ (UNLOAD 01), the second goal conjunct (at 02 LocB) cannot be achieved
because the Rocket cannot return to pick up the remaining object.
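
The failure described above can be reproduced with a small state-transition sketch. The operator definitions below are assumptions made for illustration, since the content of Figure 1.2 is not reproduced in this text; only the operator names, the object labels, and the location names are taken from the example.

```python
# Hypothetical encoding of the one-way-rocket example. Preconditions and
# effects are assumed; the thesis defines the real operators in Figure 1.2.

def load(state, obj):
    # Assumed precondition: the object is where the rocket is.
    if state[obj] == state["rocket"]:
        return {**state, obj: "in-rocket"}
    return None  # operator not applicable

def move_rocket(state):
    # Irreversible: the rocket can only travel from LocA to LocB.
    if state["rocket"] == "LocA":
        return {**state, "rocket": "LocB"}
    return None

def unload(state, obj):
    if state[obj] == "in-rocket":
        return {**state, obj: state["rocket"]}
    return None

def run(state, steps):
    """Apply operators in order; return None as soon as one is inapplicable."""
    for op, *args in steps:
        state = op(state, *args)
        if state is None:
            return None
    return state

initial = {"rocket": "LocA", "01": "LocA", "02": "LocA"}

# Linear strategy: fully achieve (at 01 LocB) before looking at (at 02 LocB).
after_first_goal = run(initial, [(load, "01"), (move_rocket,), (unload, "01")])
second_goal_attempt = run(after_first_goal, [(load, "02")])

# Interleaving operators for both goal conjuncts succeeds.
interleaved = run(initial, [(load, "01"), (load, "02"), (move_rocket,),
                            (unload, "01"), (unload, "02")])
```

Under these assumed operators, `second_goal_attempt` is `None` because the rocket is already at LocB, while the interleaved sequence leaves both objects at LocB.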
Nonlinear planners can generate optimal plans for these problems because they
are free from the linearity assumption.² However, for other problems that could
be  solved  by  a  linear  planner,  nonlinear  planners  may  be  less  efficient  than  linear 
planners.  For  example,  a  nonlinear  planner  which  uses  a  goal  set  as  opposed 
to  a  goal  stack,  such  as  NOLIMIT  [Veloso,  1989],  has  more  choices  to  consider 
at  each  goal-selection  point.  This  allows  an  optimal  plan  to  be  generated  for  a 
given  problem;  however,  the  overall  planning  performance  may  be  decreased  by 
the  increased  branching  factor. 

It  has  been  known  that  partial-order  planners  can  efficiently  solve  problems  in 
which  the  specific  order  of  the  plan  steps  is  critical  [Sacerdoti,  1975,  Tate,  1977, 
Chapman,  1987,  McAllester  and  Rosenblitt,  1991,  Barrett  and  Weld,  1992].  This 
is  done  by  delaying  step-ordering  decisions  as  long  as  possible,  so  that  the  size  of 
the  plan  space  can  be  smaller  than  those  of  total-order  planners.  However,  they 
pay the cost of having a more complex ordering procedure [Minton et al., 1991].

For example, the partial-order planner SNLP [McAllester and Rosenblitt, 1991,
Barrett and Weld, 1992] detects a threat between a step and a causal link whenever
a  new  step  or  causal  link  is  added.  The  ordering  procedure  searches  over  the  space 
of  ordering  constraints  to  resolve  the  detected  threat.  This  scheme  can  be  quite 
effective  if  there  are  many  threats  in  a  problem.  However,  if  there  are  only  a  few 
trivially-resolvable  threats  in  a  problem,  it  is  generally  less  efficient  to  use  such  a 
complex  threat-detecting  and  resolving  algorithm  for  the  entire  problem. 

Figure 1.3 illustrates the scope and performance for a hypothetical set of
single-method planners. The inherent trade-off between a planner’s scope and its
performance, as shown in the figure, suggests that single-method planning has a
limitation in performing efficiently over a wide range of problems, so that a more
flexible planning approach is needed.


² The term “nonlinear” in this context implies that it is allowable to interleave operators in
service of different goal conjuncts. It does not necessarily mean that either partial-order or
least-commitment planning is used.
Figure 1.3: Hypothetical trade-off between single-method planners’ scope and
performance.


1.1  Overview  of  the  Approach 

As an alternative to single-method planning, multi-method planning is investigated
in this thesis. A multi-method planner is an integrated system which utilizes a
coordinated set of methods, where each method has different scope and performance
[Lee and Rosenbloom, 1992, Lee and Rosenbloom, 1993]. The focus in this thesis is
on multi-method planning in a single serial environment.³ Figure 1.4 shows an
example of a multi-method planner in which two different methods — linear planning
and  nonlinear  planning  methods  —  are  coordinated  sequentially  in  a  single  serial 
environment.  In  this  planner,  the  linear  method  has  better  overall  performance 
than  the  nonlinear  method,  while  the  nonlinear  method  can  solve  more  problems 
than  the  linear  method.  Given  a  problem,  the  linear  method  is  tried  first  to  solve 
that  problem.  If  it  cannot  solve  the  problem,  the  nonlinear  method  is  tried. 

³ In a multi-agent environment, multi-method planning can be accomplished by running the
methods in parallel until the problem is solved via one of the methods [Bond and Gasser, 1988].
However, detailed discussion on this issue is beyond the scope of this thesis.

Figure 1.4: An example of a multi-method planner.

The potential advantage of multi-method planning over single-method planning
is that multi-method planning can achieve both applicability and efficiency at the
same time. Theoretically, the scope of a multi-method planner can be the union of
the scopes of all individual single-method planners in the multi-method planner.
Thus, a multi-method planner is at least as applicable as the most general
single-method planner within the multi-method planner. If a single-method planner is
complete for a domain (that is, it can solve all problems in the domain), then any
multi-method planner which includes the single-method planner is also complete.
With respect to efficiency, if a multi-method planner includes a method which
is very efficient for some classes of problems, and that method can be selected
for those classes of problems without too much extra effort, then multi-method
planning can have an overall efficiency gain over single-method planning.
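
The sequential coordination and the scope argument above can be sketched in a few lines. This is a minimal illustration, not the Soar implementation used in the thesis: the methods are stand-ins that "solve" exactly the problems in a given set, and the problem names `p1` to `p4` and the helper `make_method` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Set, Tuple

Plan = List[str]
Method = Callable[[str], Optional[Plan]]

def make_method(name: str, scope: Set[str]) -> Method:
    """Build a toy method that solves exactly the problems in its scope."""
    def method(problem: str) -> Optional[Plan]:
        return [f"{name}-plan-for-{problem}"] if problem in scope else None
    return method

def sequential_planner(methods: Sequence[Tuple[str, Method]], problem: str):
    """Try each method in order; return (method name, plan) of the first success."""
    for name, method in methods:
        plan = method(problem)
        if plan is not None:
            return name, plan
    return None, None

# Assumed scopes: the nonlinear method solves more problems than the linear one.
linear_scope = {"p1", "p2"}
nonlinear_scope = {"p1", "p2", "p3"}
methods = [("linear", make_method("linear", linear_scope)),
           ("nonlinear", make_method("nonlinear", nonlinear_scope))]

solved_by = {p: sequential_planner(methods, p)[0] for p in ["p1", "p2", "p3", "p4"]}
multi_scope = {p for p, name in solved_by.items() if name is not None}
```

Here the cheaper linear method handles `p1` and `p2`, the nonlinear method is used only for `p3`, and the resulting scope equals the union of the two individual scopes.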

With  this  potential  advantage  of  multi-method  planning  in  hand,  the  ideal 
multi-method  planner  would  be  able  to  solve  each  problem  with  the  most  efficient 
method  that  is  sufficient  to  solve  it.  In  general,  however,  it  is  not  known  a  priori 
which  method  is  the  most  appropriate  one  for  a  given  problem.  The  best  way  to 
approach  this  ideal  is  to  learn  about  which  methods  to  use  for  which  classes  of 
problems  from  a  training  set  of  problems.  This  type  of  method  learning  can  be 
accomplished  by  either  an  analytical  approach  or  an  empirical  approach. 

The  analytical  approach  to  learning  is  based  on  reasoning  about  why  the  given 
training  problem  is  solved  (or  cannot  be  solved)  by  the  current  method.  If  the 
problem  is  solved  by  the  method,  one  constructs  an  explanation  which  proves 
that  the  problem  is  a  positive  instance  of  the  goal  concept  ‘solved’.  Then,  this 
explanation is generalized and positive control knowledge is learned which selects
the method for later similar problems. In contrast, if the method fails to solve the
problem, one constructs an explanation which proves that the problem is a positive
instance of the goal concept ‘unsolvable’. In this case, negative control knowledge
is  learned  which  avoids  the  method  for  later  similar  problems. 

The empirical approach to learning is based on the performance of those methods
for a training set of problems. Instead of learning control knowledge by
analyzing a solution trace for each problem, this approach extracts the information
needed to select an appropriate method or to avoid a set of inappropriate methods
for the set of problems under a fixed distribution. Since the extracted information
is a function of the problem distribution, this approach can be used flexibly for
other problem distributions or other domains. The multi-method planning framework
investigated in this thesis is based on the empirical approach; however, the
analytical approach will also be discussed later in more detail.

Within the empirical multi-method planning framework, the main goal of this
research  is  to  create  a  set  of  multi-method  planners  which  are  more  efficient  and 
applicable  than  single-method  planners.  Towards  this  end,  the  basic  issues  to 
be  investigated  are:  (1)  how  to  create  individual  methods  which  have  different 
performance  and  scope  so  that  the  created  multi-method  planner  can  have  both 
highly efficient methods and highly applicable methods; and (2) how to coordinate
the  created  methods  in  an  efficient  manner  so  that  the  multi-method  planner  can 
have  high  performance.  Each  of  these  issues  is  discussed  in  turn  in  the  following 
subsections. 

1.1.1  Method  Creation 

In  order  for  a  multi-method  planner  to  satisfy  both  efficiency  and  applicability, 
the  single  methods  in  the  multi-method  planner  should  range  from  highly  efficient 
methods  to  a  complete  method.  For  this  purpose,  a  methodology  to  create  a  set  of 
methods  with  different  performance  and  scope  is  developed.  This  methodology  is 
based  on  the  notion  of  bias  in  planning  [Rosenbloom  et  al.,  1993].  Bias  in  planning 
is any basis for choosing one plan over another other than plan correctness. With
the view of planning as search over a space of plans [Korf, 1987], a bias is a
restriction over the space of plans considered that determines which portion of the
entire plan space can or will be the output of the planning process. For example,
a linearity bias eliminates plans in which operators for different goal conjuncts are
interleaved.

In  general,  a  bias  can  potentially  reduce  computational  effort  by  reducing  the 
number  of  plans  that  must  be  examined,  and  it  can  potentially  generate  shorter 
plans by avoiding plans containing inefficient operator sequences. However, this is
not always the case. For example, if the space eliminated by a bias is not large
enough  or  the  eliminated  space  does  not  include  a  sufficient  number  of  inefficient 
plans,  the  bias  has  no  effect.  Whether  or  not  these  cases  happen  relies  on  the 
domain  characteristics.  Thus,  it  is  important  to  devise  biases  which  are  really 
effective  for  a  given  domain  in  terms  of  performance  improvement. 

For a training set of problems in a given domain, a bias is called effective if the
average planning effort for the biased method over the training problem set is less
than the average planning effort for the method that does not use that bias, and
the average length of plans generated from the biased method over the training
problem set is less than the average length of the plans generated from the method
which does not use that bias.
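
The definition of an effective bias can be written as a direct check over training results. The sketch below assumes that each method has been run on the same training problems and that every run produced a plan; the function name `is_effective` and all numbers are hypothetical.

```python
from statistics import mean

def is_effective(biased, unbiased):
    """Return True if the bias lowers both average effort and average plan length.

    Each argument is a list of (planning_effort, plan_length) pairs,
    one pair per training problem.
    """
    biased_effort = mean([effort for effort, _ in biased])
    unbiased_effort = mean([effort for effort, _ in unbiased])
    biased_length = mean([length for _, length in biased])
    unbiased_length = mean([length for _, length in unbiased])
    return biased_effort < unbiased_effort and biased_length < unbiased_length

# Hypothetical training results for a method with no bias.
unbiased_runs = [(30, 5), (25, 6), (18, 4)]

# Bias X: less effort and shorter plans on average.
bias_x_runs = [(12, 4), (20, 6), (9, 3)]

# Bias Y: less effort, but longer plans on average.
bias_y_runs = [(10, 6), (15, 7), (8, 5)]

x_effective = is_effective(bias_x_runs, unbiased_runs)
y_effective = is_effective(bias_y_runs, unbiased_runs)
```

With these numbers, bias X satisfies both conditions, while bias Y fails the plan-length condition and so is not effective under the definition.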

By varying the effective biases, a set of methods with different performance and
scope can be created. Given a set of effective biases, the most restricted method,
which uses all of the biases, is the most efficient one, but can be incomplete
if the desired plans are eliminated. On the other hand, the least restricted method,
which uses no bias, is the least efficient one, but can be a complete method
since no plans are eliminated.

1.1.2  Method  Coordination 

Once a set of methods with different performance and scope is created, these methods
need to be coordinated efficiently so that the created multi-method planner can
satisfy both efficiency and applicability. Method coordination, as used here, refers
to (1) the selection of appropriate methods as situations arise, and (2) the
granularity of method switching as the situational demands shift.

Method selection: For method selection, individual methods need to be organized
so that a higher level control structure can determine which method to use
first and which method to use next if the current method fails. Two straightforward
ways of organizing individual methods in a multi-method planner are sequential
and time-shared. A sequential multi-method planner consists of a sequence of
single-method  planners.  A  time-shared  multi-method  planner  consists  of  a  set  of 
single-method  planners  in  which  each  method  is  active  in  turn  for  a  given  time 
slice  [Barley,  1991].  In  either  approach,  the  key  issue  is  how  to  reduce  the  effort 
wasted  in  using  inappropriate  methods. 

The wasted effort in a sequential multi-method planner is the cost of trying
inappropriate earlier methods in the sequence, whereas the wasted effort in a
time-shared multi-method planner is the cost of trying all methods in the method set
except the one that actually solves the problem. The wasted effort in sequential
multi-method planning is sensitive to the ordering of the methods: planning takes
too much time if inappropriate earlier methods are not efficient enough, and in an
extreme case no plan is generated at all if one of the inappropriate
earlier methods does not halt. On the other hand, the wasted effort in
time-shared multi-method planning is sensitive to the number of individual methods.
Also, time-shared multi-method planning switches among methods more often than
sequential multi-method planning, and so incurs more overhead for context switching.
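
A rough cost model makes the comparison concrete. The sketch below is a simplification that is not taken from the thesis: it assumes equal time slices, so that every method has received about as much time as the solving method when the solution is found, and it ignores context-switching overhead. All function names and numbers are hypothetical.

```python
import math

def sequential_waste(failure_costs):
    """Effort spent on earlier methods that failed before the solving method ran."""
    return sum(failure_costs)

def time_shared_waste(solver_time, other_halting_times):
    """Effort spent on every non-solving method while the solver was running.

    Each other method runs until it halts on its own or until the solver
    finishes, whichever comes first (equal time slices assumed).
    """
    return sum(min(solver_time, t) for t in other_halting_times)

# Method A fails after 5 units, method B solves the problem after 20 units,
# and method C would need 50 units. Sequential order is A, B, C.
seq_waste = sequential_waste([5])
shared_waste = time_shared_waste(20, [5, 50])

# Same setting, except that method A never halts.
seq_waste_no_halt = sequential_waste([math.inf])
shared_waste_no_halt = time_shared_waste(20, [math.inf, 50])
```

Under this model the sequential planner wastes less when the early method fails quickly, but its waste is unbounded when that method does not halt, whereas the time-shared waste stays bounded and grows with the number of methods.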

The planning approach primarily investigated in this thesis is a special type
of sequential multi-method planning, called monotonic multi-method planning [Lee
and Rosenbloom, 1992]. In a monotonic multi-method planner, individual methods
are sequenced so that the earlier methods are more efficient and have less coverage
than the later methods. Compared with the single-method approach with planner
completeness and the time-shared multi-method approach, the monotonic
multi-method approach can potentially generate plans more efficiently. The idea is that
if the biases used in efficient methods can prune the search space, the problems
solvable by efficient methods should be solved more quickly, while problems requiring
fewer biases should not waste too much extra time trying out the insufficient
early methods. In this way, a monotonic multi-method planner can retain planner
completeness by allowing the least restricted method to be used, while it can
generate low cost plans efficiently by using more restricted methods.

One way to construct a monotonic multi-method planner is to use the biases
which themselves increase efficiency. Individual methods are sequenced so that
the set of biases used in a method is a subset of the biases used in earlier methods,
and the later methods have more coverage than the earlier methods. This
means that planning starts by trying highly efficient methods, and then successively
relaxes biases until a sufficient method is found. This type of planning
is called bias-relaxation multi-method planning. A bias-relaxation multi-method
planner is not necessarily a monotonic multi-method planner if there are interactions
among biases. However, one can generate monotonic multi-method planners
via bias-relaxation by just testing whether monotonicity holds for the created
bias-relaxation multi-method planners. In bias-relaxation multi-method planning, each
bias is evaluated independently by comparing a method which uses that bias only
and a method which uses no bias. Thus, bias-relaxation multi-method planning
has a restricted scope in creating and comparing individual methods.
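
The two properties discussed above, bias relaxation and monotonicity, can be tested separately on measured data. The following sketch assumes that each method is summarized by its bias set, its average effort, and the set of training problems it solves. The bias labels, the statistics, and the use of non-strict comparisons are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

@dataclass(frozen=True)
class MethodStats:
    name: str
    biases: FrozenSet[str]   # biases the method uses
    avg_effort: float        # average planning effort on the training set
    solved: FrozenSet[str]   # training problems the method can solve

def is_bias_relaxation(seq: List[MethodStats]) -> bool:
    """Each later method uses a subset of the biases of the method before it."""
    return all(later.biases <= earlier.biases
               for earlier, later in zip(seq, seq[1:]))

def is_monotonic(seq: List[MethodStats]) -> bool:
    """Earlier methods are no more costly and cover no more than later ones."""
    return all(earlier.avg_effort <= later.avg_effort
               and earlier.solved <= later.solved
               for earlier, later in zip(seq, seq[1:]))

m1 = MethodStats("M1", frozenset({"linearity", "protection"}), 10.0,
                 frozenset({"p1"}))
m2 = MethodStats("M2", frozenset({"protection"}), 25.0,
                 frozenset({"p1", "p2"}))
m3 = MethodStats("M3", frozenset(), 60.0, frozenset({"p1", "p2", "p3"}))

# Hypothetical interaction: dropping a bias makes this method cheaper than M1.
m2_interacting = MethodStats("M2x", frozenset({"protection"}), 8.0,
                             frozenset({"p1", "p2"}))

plain = [m1, m2, m3]
with_interaction = [m1, m2_interacting, m3]
```

Both sequences are bias-relaxation sequences, but only `plain` passes the monotonicity test, which mirrors the remark that bias relaxation alone does not guarantee monotonicity.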

Granularity  of  method  switching:  The  second  issue  of  method  coordination 
is  the  granularity  at  which  individual  methods  are  switched  [Lee  and  Rosenbloom, 
1993].  This  issue  is  important  in  terms  of  a  planner's  performance,  because  the 
performance of a multi-method planner can be changed according to the granularity
of  shifting  control  from  method  to  method.  The  family  of  multi-method  planning 
systems  can  be  viewed  on  a  granularity  spectrum.  At  one  extreme  there  is  the 
normal  single-method  approach,  where  one  method  is  selected  ahead  of  time  for 
the  entire  set  of  problems.  At  another  point  of  this  spectrum  are  coarse-grained 
multi-method  planners,  where  methods  are  switched  for  a  whole  problem  when  no 
solution  can  be  found  within  the  current  method.  Toward  the  other  extreme,  there 
are fine-grained multi-method planners, where methods are switched at any point
during a problem at which a new set of subgoals is formulated. Time-shared
multi-method planners, where methods are switched based on the time slice, also can be
viewed  on  the  spectrum. 

There is a trade-off between coarse-grained multi-method planning and
fine-grained multi-method planning. A coarse-grained multi-method planner examines
all paths within the current biased space until a solution is found or all paths are
exhausted. Thus, a coarse-grained multi-method planner finds a solution within
the first method that has one at the cost of searching the entire biased space in
the worst case (unless some form of within-method learning or heuristics is used
to prune out some portions of the space, or unless the time limit is exceeded). On
the other hand, a fine-grained multi-method planner falls back on the next method
whenever the partial plan for the current solution path cannot be expanded without
violating the biases used in the current method. Thus, it can save the effort of
backtracking within the current method. However, it is not guaranteed to find a
solution that may exist within the current biased space.
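
The trade-off can be seen on a toy search tree. The sketch below is a deliberately simplified model and not the planners implemented in the thesis: each edge carries the index of the least restricted method that is needed to use it, the coarse-grained planner searches each biased space exhaustively before switching, and the fine-grained planner follows a single path and relaxes the bias in place when it gets stuck. The tree, the node names, and both functions are hypothetical.

```python
# Edges are (child, level); method m may use an edge only if level <= m.
TREE = {
    "root": [("a", 0), ("b", 0)],
    "a": [("a1", 0)],
    "a1": [("a2", 0), ("a4", 0), ("a5", 0)],
    "a2": [("g1", 1)],   # reaching g1 requires relaxing the bias
    "b": [("g0", 0)],    # g0 lies inside the most restricted space
}
GOALS = {"g0", "g1"}

def coarse_grained(tree, goals, n_methods, root="root"):
    """Depth-first search of each biased space in turn, restarting per method."""
    visited = 0
    for m in range(n_methods):
        stack = [[root]]
        while stack:
            path = stack.pop()
            visited += 1
            if path[-1] in goals:
                return path, m, visited
            allowed = [c for c, level in tree.get(path[-1], []) if level <= m]
            for child in reversed(allowed):
                stack.append(path + [child])
    return None, None, visited

def fine_grained(tree, goals, n_methods, root="root"):
    """Follow one path; when the bias blocks expansion, switch method in place."""
    path, m, visited = [root], 0, 1
    while path[-1] not in goals:
        allowed = [c for c, level in tree.get(path[-1], []) if level <= m]
        if allowed:
            path.append(allowed[0])
            visited += 1
        elif m + 1 < n_methods:
            m += 1           # fall back on the next, less restricted method
        else:
            return None, None, visited
    return path, m, visited

coarse_result = coarse_grained(TREE, GOALS, n_methods=2)
fine_result = fine_grained(TREE, GOALS, n_methods=2)
```

In this example the coarse-grained search finds the short plan inside the first method after visiting eight nodes, while the fine-grained search visits only five nodes but ends with a longer plan under the relaxed method, having missed the solution that existed in the first biased space.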


1.2  Implementation 

A  set  of  single-method  planners  and  bias-relaxation  multi-method  planners  —  both 
coarse-grained  and  fine-grained  versions  —  have  been  implemented  in  the  context 
of the Soar architecture [Laird et al., 1987, Rosenbloom et al., 1991]. Soar is a
useful vehicle for this work because its impasse-driven subgoaling scheme provides
the necessary context for planning and its multiple problem-space scheme facilitates
the  multi-method  planning  approach,  though  it  is  difficult  to  implement  context 
switching  for  time-shared  multi-method  planners  in  Soar. 

Speed-up learning is used in both single-method planners and multi-method
planners for each problem, but only within-trial transfer is allowed; that is, rules
learned during one problem are not used for other problems. However, learned
rules are allowed to transfer from an earlier method to a later method (for the
same problem). That is, if a search path is evaluated within one method, and the
results  of  the  evaluation  depend  only  on  aspects  of  the  method  that  are  shared 
by  a  second  method,  then  it  should  not  be  necessary  to  repeat  that  path  when 
the  second  method  is  tried.  Learned  rules  do  not  transfer  across  trials,  because 
some  rules  are  expensive  so  that  they  may  increase  the  planning  time  for  later 
problems  [Tambe  and  Newell,  1988].  Restricting  expressiveness  such  as  by  the 
unique-attribute  scheme  can  solve  this  problem  [Tambe  et  al.,  1990];  however  this 
thesis  uses  the  multi-attribute  scheme  to  learn  rules  with  higher  generality. 

The  implemented  multi-method  planners  are  compared  with  single-method 
planners  theoretically  and  experimentally  in  three  domains:  blocks-world,  machine 
shop scheduling, and a simulated agent domain. In this thesis, the focus is on plans
represented  by  STRIPS-like  operators;  however,  since  the  multi-method  framework 
in  this  thesis  is  independent  of  the  operator  representation,  this  framework  should 
be  extendable  to  planners  with  more  expressive  plan  representations. 

1.3  Contributions 

The  primary  contributions  of  this  thesis  include  the  following: 

1. A methodology for building a set of planning methods with different
performance and scope. A bias determines the portion of the entire plan space
considered. In particular, an effective bias improves planning performance
by reducing the number of candidate plans and generates shorter plans by
avoiding inefficient operator sequences. A methodology is developed to select
a set of effective biases based on performance over a training problem
set. By varying the selected effective biases, a set of methods with different
performance and scope can be created.

2. A new planning framework for multiple methods. A new multi-method
planning framework is developed based on the relaxation of biases. Issues
arising in multi-method planning, such as how to efficiently coordinate
individual methods within a multi-method framework and the granularity at
which methods can be switched, are investigated.

3. Performance improvement over single-method planning. Bias-relaxation
multi-method planning with various granularities of method switching provides
a planning system which can improve planning efficiency and reduce
plan length without loss of planner completeness. In fact, for the domains
investigated, both coarse-grained and fine-grained multi-method planning can
reduce plan length significantly compared with single-method planning, and
fine-grained planning can improve the planning time significantly compared
with coarse-grained and single-method planning.

1.4  Guide  to  the  Thesis 

The  body  of  this  thesis  consists  of  six  chapters. 

Chapter  2  defines  the  notion  of  bias  in  planning.  Examples  of  planning  bias 
are  presented  along  with  the  justifications  on  which  these  biases  depend.  The 
differences between bias and search control heuristics are described.

Chapter  3  explains  two  bias  dimensions  —  goal  flexibility  and  goal  protection 
—  and  defines  single-method  planners  that  vary  along  these  dimensions.  The 
implementation  of  these  planners  in  Soar  is  described,  and  learning  in  Soar  for 
single-method planning is discussed. Finally, experimental results in the
blocks-world and machine-shop scheduling domains are provided.

Chapter 4 specifies how to build monotonic multi-method planners and
bias-relaxation multi-method planners from a set of single-method planners. The issue
of granularity at which individual methods can be switched is investigated, and
learning in multi-method planning is discussed. Experimental results for
coarse-grained and fine-grained multi-method planners are presented and compared with
the  results  for  single-method  planners.  Finally,  the  performance  of  multi-method 
planning  is  compared  with  the  performance  of  partial-order  planning. 


Chapter 5 shows how this approach can be applied to a more complex domain
such  as  a  simulated  agent  domain. 

Finally,  chapters  6  and  7  discuss  related  work  and  conclusions. 
Chapter 2


Bias  in  Planning 


Bias  was  originally  defined  in  the  context  of  concept  learning  from  preclassified 
training  instances  as  any  basis  for  choosing  one  generalization  over  another,  other 
than strict consistency with the observed training instances [Mitchell, 1980].
Transferring the notion of bias to planning, it can be defined as “any basis for choosing
one plan over another other than plan correctness”, where a plan is correct if the
application of the plan transforms the initial state into the goal state [Rosenbloom
et al., 1993].

The notion of bias is useful in planning, because bias can reduce computation
effort by reducing the number of plans that must be examined, and it can
potentially generate shorter plans by avoiding plans containing inefficient operator
sequences such as ones that undo achieved goals or loop on states. The notion
is particularly useful in multi-method planning, because bias can provide a
basis for building a set of planning methods with different performance and scope.
Also,  method  switching  in  multi-method  planning  can  be  easily  accomplished  by 
changing  the  set  of  biases  used  in  the  individual  methods. 

This  chapter  begins  with  the  notion  of  bias  in  inductive  concept  learning,  and 
then  describes  how  this  notion  is  applied  to  planning.  Some  examples  of  planning 
biases  are  presented,  and  finally,  the  relationship  between  search  control  and  bias 
is  discussed. 
2.1 Bias in Inductive Learning


An induction problem is, in general, given an instance description language and
a set of training instances, to determine a generalization that is consistent with
the training instances [Mitchell, 1980]. In induction, an unbiased hypothesis space,
denoted as $\mathcal{H}$, consists of every possible generalization on the instance space, that
is, the power set of the training instances. The unbiased version space, denoted as
$\mathcal{V} \subseteq \mathcal{H}$, is the portion of $\mathcal{H}$ that is consistent with the observed training
instances. Then, a bias $b$ determines a biased hypothesis space, denoted as
$\mathcal{H}_b \subseteq \mathcal{H}$, so that the output generalization can be selected from
$\mathcal{V} \cap \mathcal{H}_b$.

It has been shown that bias plays an important role in induction, because it
influences hypothesis selection [Utgoff, 1986]. Without bias, an induction system
has  no  basis  for  choosing  one  generalization  over  another.  In  other  words,  bias 
enables  induction  systems  to  determine  how  to  go  beyond  the  training  instances; 
that  is,  which  inductive  leaps  to  make. 

Bias can be either absolute or relative. An absolute bias completely removes
parts of the unbiased version space. For example, a generalization language provides
an absolute bias by eliminating any element of the unbiased version space not
expressible in the language. A relative bias defines a partial order over portions of
the unbiased version space. For example, one can prefer one hypothesis to another
based on measures such as simplicity of the hypothesis [Michalski et al., 1986].
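
The distinction between absolute and relative bias can be illustrated with a minimal sketch. The three-element instance space, the training labels, and both bias definitions below are assumptions chosen for illustration; they do not come from the cited work.

```python
from itertools import combinations

# Toy instance space (an assumption for illustration only).
instances = ["a", "b", "c"]

# Unbiased hypothesis space: every subset of the toy instance space.
hypotheses = [frozenset(c) for r in range(len(instances) + 1)
              for c in combinations(instances, r)]

# Preclassified training instances: "a" is positive, "b" is negative.
positives, negatives = {"a"}, {"b"}

def consistent(h):
    """A hypothesis is consistent if it covers all positives and no negatives."""
    return positives <= h and not (negatives & h)

# Unbiased version space: the consistent part of the hypothesis space.
version_space = [h for h in hypotheses if consistent(h)]

# Absolute bias (hypothetical): the language can only express hypotheses
# covering at most one instance, so everything else is removed outright.
def expressible(h):
    return len(h) <= 1

absolute = [h for h in version_space if expressible(h)]

# Relative bias (hypothetical): prefer simpler (smaller) hypotheses, but
# keep every member of the version space available.
relative = sorted(version_space, key=len)
```

Under the absolute bias only one hypothesis survives, whereas the relative bias merely orders the two consistent hypotheses and discards neither.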

2.2  Application  of  Bias  to  Planning 

As in inductive learning, the notion of bias can be formalized in planning.
Planning can be defined in terms of the notion of a problem space [Newell et al.,
1991]. A problem space consists of a set of states $S$ and a set of operators $O$.
A problem, denoted as $p = (S_0, S_g)$, consists of two components, $S_0 \in S$ and
$S_g \in S$, where $S_0$ is a description of an initial state of the world and $S_g$ is a partial
description of a desired state. A plan for a problem $p = (S_0, S_g)$ can be defined as
a structure that represents the sequence of operators in $O$ that achieves $S_g$ from
$S_0$ by applying each operator to each of the resulting states in the sequence.

An unbiased plan space, denoted as $\mathcal{P}$, is the “power sequence” (that is, the
set of all sequences) of the possible operators in $O$.^ For a given problem $p$, the
unbiased plan space for $p$, denoted as $\mathcal{P}_p \subseteq \mathcal{P}$, is the portion of $\mathcal{P}$ for which each
element of $\mathcal{P}_p$ solves $p$. Then, a bias $b$ determines the biased plan space, denoted
as $\mathcal{P}_b \subseteq \mathcal{P}$, so that the output plan can be selected from $\mathcal{P}_p \cap \mathcal{P}_b$.
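
The set-based definitions above can be made concrete with a small sketch. The integer-valued toy domain, the two operators, the length cutoff used to enumerate a finite slice of the (infinite) plan space, and the particular bias are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy domain: states are integers, with two operators.
OPERATORS = {"inc": lambda s: s + 1, "dbl": lambda s: s * 2}

def run(plan, state):
    """Apply each operator in the sequence to the resulting states in turn."""
    for name in plan:
        state = OPERATORS[name](state)
    return state

def plan_space(max_len):
    """All operator sequences up to max_len (a finite slice of the plan space)."""
    for n in range(max_len + 1):
        yield from product(OPERATORS, repeat=n)

def solves(plan, s0, goal):
    return run(plan, s0) == goal

s0, goal = 1, 4
P = list(plan_space(3))                               # slice of the unbiased plan space
P_p = {plan for plan in P if solves(plan, s0, goal)}  # plans that solve p
P_b = {plan for plan in P if len(plan) <= 2}          # illustrative absolute bias b
candidates = P_p & P_b                                # where the output plan may come from
```

Four of the fifteen enumerated sequences solve the problem, and the bias leaves two of them as possible outputs.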

Figure 2.1 shows the analogy between the processes of inductive concept learning
and planning. In both cases the output of the process is to be some element
of  the  unbiased  hypothesis  space  that  is  consistent  with  the  process’s  correctness 
criterion.  Where  the  two  cases  differ  is  in  the  definitions  of  “unbiased  hypothesis 
space”  and  “correctness  criterion”.  In  concept  learning,  the  unbiased  hypothesis 
space is the power set of the training instances, and the correctness criterion is consistency
with the observed training instances. In planning, the unbiased hypothesis
space  is  the  power  sequence  of  the  possible  operators,  and  the  correctness  criterion 
is  whether  the  application  of  the  plan  achieves  the  goal  state  from  the  initial  state. 
In  spite  of  these  differences,  bias  together  with  the  process’s  correctness  criterion, 
in  both  cases,  determines  which  portion  of  the  unbiased  space  can  be  the  output 
of  the  process. 

As in the case of induction, an absolute use of bias in planning engenders incompleteness
in the planner. This incompleteness can be used to speed up the
planner by reducing the number of plans that the planner can possibly generate
for particular problems. However, it only really helps if the bias is an appropriate
one; otherwise, desired plans can be eliminated. A relative use of bias does not
introduce incompleteness. However, if the bias is not an appropriate one, generated
plans may not be the desired ones. Thus, in order to show that using a bias is
plausible, some form of appropriate justification is needed. For example, with an

^The  specification  here  assumes  that  the  plan  space  contains  only  totally-ordered  sequences 
of  operators,  but  it  does  not  rule  out  a  search  strategy  that  incrementally  specifies  an  element  of 
the  plan  space  by  refining  a  partially-ordered  plan  structure. 
Figure 2.2: Example of the effects of a linearity bias on the plan space: (a) initial
state  and  goal  conjuncts,  (b)  plan  eliminated,  (c)  plan  remaining. 


independence justification, one assumes that goal conjuncts are achieved by independent
processes without interfering with other goal conjuncts in a conjunctive
goal problem. With a progress justification, one assumes that it is always possible
to move forward to solve the problem, and never required to move backward. A
boundedness justification limits the total effort that it is reasonable to expend in
solving a problem or a set of problems. In the next section, examples of planning
biases based on these justifications are presented.


2.3  Examples  of  Planning  Biases 

Two typical planning biases justified by an independence justification are linearity
and protection. A linearity bias removes all plans in which operators for
different  unachieved  goal  conjuncts  occur  in  succession;  that  is,  once  an  operator 
for  one  unachieved  goal  conjunct  is  in  the  plan,  operators  for  other  conjuncts  can 
be  placed  only  after  the  first  goal  conjunct  has  been  achieved.  For  example,  given 
^The notion of protection used here was introduced by Sussman [1973]. Other forms
of protection can be found in the planning literature. For example, one can protect the
current goal conjunct from being clobbered by other operators while regressing an operator
or a goal through a partial linear plan [Warren, 1976, Waldinger, 1977]. In partial-order
planning, one can protect a causal link from being clobbered by any other planning
steps, within the interval where the causal link is needed [Tate, 1977, Chapman, 1987,
Barrett and Weld, 1992].
Figure 2.4: Example of the effects of a directness bias on the plan space: (a) initial
state  and  goal  conjuncts,  (b)  plan  eliminated,  (c)  plan  remaining. 


operator  (MOVE  A  Table)  undoes  the  goal  conjunct  (on  A  B)  which  is  established 
by  the  earlier  operator  (MOVE  A  B),  while  plans  such  as  the  one  in  Figure  2.3(c) 
would remain. Protection is based on an independence justification since one assumes
that while solving one goal conjunct, operators that interact negatively with
previous  goal  conjuncts  need  not  be  considered. 
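
A protection check over a totally-ordered plan can be sketched as follows. The representation of operators as (name, add set, delete set) triples, the three-block initial state, and the goal set are assumptions for illustration, and `clear` literals are omitted for brevity.

```python
def violates_protection(plan, initial, goals):
    """Return True if some operator deletes a goal conjunct that was already achieved."""
    state = set(initial)
    achieved = state & goals
    for _name, adds, deletes in plan:
        if achieved & deletes:
            return True                      # an achieved goal conjunct is undone
        state = (state - deletes) | adds
        achieved = achieved | (state & goals)
    return False

# Hypothetical problem: three blocks on the table, goal is the stack A-B-C.
initial = {("on", "A", "Table"), ("on", "B", "Table"), ("on", "C", "Table")}
goals = {("on", "A", "B"), ("on", "B", "C")}

# (MOVE A Table) undoes (on A B), which the earlier (MOVE A B) established.
eliminated = [
    ("MOVE A B", {("on", "A", "B")}, {("on", "A", "Table")}),
    ("MOVE A Table", {("on", "A", "Table")}, {("on", "A", "B")}),
]
remaining = [
    ("MOVE B C", {("on", "B", "C")}, {("on", "B", "Table")}),
    ("MOVE A B", {("on", "A", "B")}, {("on", "A", "Table")}),
]
```

A protection bias used absolutely would discard `eliminated` and keep `remaining`.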


A  progress  justification  underlies  all  greedy  biases.  For  example,  protection  is 
also justified by a progress justification, because once a goal is achieved, it would
never  be  undone.  Another  bias  justified  by  a  progress  justification  is  directness. 
A  directness  bias  eliminates  all  plans  in  which  there  is  at  least  one  operator  that 
does  not  directly  achieve  a  goal  conjunct  included  in  the  problem  definition.  For 
example,  given  the  goal  conjuncts  and  operators  in  Figure  2.4(a),  plans  such  as 
the  one  in  Figure  2.4(b)  would  be  eliminated  since  the  operator  (MOVE  B  A)  does 
not  directly  achieve  any  of  the  goal  conjuncts  in  the  problem  definition,  while 
plans  such  as  the  one  in  Figure  2.4(c)  would  remain.  Directness  is  justified  by 
[Figure 2.5 panels: (a) initial state and goals G1: (on A Table), G2: (clear C);
(b) eliminated plan (MOVE A D) → (MOVE B Table) → (MOVE A Table), where (MOVE A D) is
for the subgoal (clear B), (MOVE B Table) is for G2 (clear C), and (MOVE A Table) is for G1;
(c) remaining plan (MOVE A Table) → (MOVE B Table), for G1 and G2 respectively.]

Figure 2.5: Example of the effects of a goal-nonrepetition bias on the plan space:
(a) initial state and goal conjuncts, (b) plan eliminated, (c) plan remaining.

a  progress  justification,  because  whenever  an  operator  is  applied,  a  new  goal  is 
always  achieved,  increasing  the  degree  of  goal  achievement  for  the  entire  problem. 
Directness  is  a  quite  interesting  bias  because  it  ensures  that  the  number  of  operators 
to  achieve  each  of  the  goal  conjuncts  is  bounded  by  one. 

Biases justified by a boundedness justification include goal-depth, goal-breadth,
plan-length, and goal-nonrepetition. Both goal-depth and goal-breadth limit the
size of the goal hierarchy used in planners based on means-ends analysis (MEA),
so that planning effort can be reduced. For a predefined bound n, a goal-depth
bias eliminates from the hypothesis space all plans that require more than n levels
of  subgoals  to  generate,  while  a  goal-breadth  bias  eliminates  all  plans  that  require 
more  than  n  conjunctive  subgoals  for  a  single  goal.  Directness  is  also  justified  by 
boundedness.  In  fact,  directness  can  be  viewed  as  a  special  case  of  goal-depth  bias, 
since  it  allows  no  generation  of  subgoals,  thus  ensuring  that  the  depth  of  the  goal 
hierarchy  is  bounded  by  one. 
A plan-length bias eliminates all plans which consist of a sequence of more than
n  operators,  for  a  predefined  bound  n,  so  that  the  length  of  the  output  plan  can 
be  bounded.  It  can  be  used  either  on  a  problem-by-problem  basis  or  on  a  goal- 
by-goal  basis.  If  plan-length  is  used  on  a  problem-by-problem  basis,  the  length  of 
the  output  plan  for  the  entire  problem  is  guaranteed  to  be  no  more  than  n.  If  it  is 
used  on  a  goal-by-goal  basis  for  a  set  of  conjunctive  goals,  the  length  of  the  plan 
for  the  conjunctive  goals  is  bounded  by  n  times  the  number  of  goal  conjuncts.  In 
fact,  directness  is  also  a  special  case  of  goal-by-goal  plan-length  bias,  where  n  =  1, 
because  the  length  of  the  plan  for  each  goal  is  bounded  by  one. 
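
The three bounds in this paragraph can be written side by side. The symbols $|\pi|$ (length of the output plan) and $k$ (number of goal conjuncts) are introduced here for illustration and are not part of the original notation.

```latex
\begin{align*}
\text{problem-by-problem:}\quad & |\pi| \le n \\
\text{goal-by-goal:}\quad       & |\pi| \le n \cdot k \\
\text{directness } (n = 1):\quad & |\pi| \le k
\end{align*}
```

For instance, with $n = 3$ and $k = 4$ goal conjuncts, the goal-by-goal bound allows plans of up to 12 operators, while directness allows at most 4.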

A  goal-nonrepetition  bias  eliminates  all  plans  that  require  a  repetition  on  a  goal 
literal; that is, if satisfying an unmet precondition for a selected operator requires
a  new  goal  conjunct  whose  literal  is  equivalent  to  the  literal  of  its  ancestor  in  the 
goal  hierarchy,  then  that  plan  would  be  eliminated.  For  example,  given  the  goal 
conjuncts  and  operators  in  Figure  2.5(a),  plans  such  as  the  one  in  Figure  2.5(b) 
would  be  eliminated  because  it  requires  a  repetition  on  a  goal  literal  —  operator 
(MOVE  B  Table)  is  chosen  for  conjunct  (clear  C),  but  in  making  it  applicable, 
an  iterative  clear  conjunct  (clear  B)  is  generated  (resulting  in  the  selection  of 
(MOVE  A  D)  as  the  first  operator).  On  the  other  hand,  plans  such  as  the  one  in 
Figure  2.5(c)  would  remain.  The  prime  reason  to  use  a  goal-nonrepetition  bias 
is that it forces learning from non-repetitive paths by eliminating all plans that
require a repetition on a goal conjunct, so that learning specific rules for each
size of repetition can be avoided. In this way, it is closely related to Etzioni's
[1990b] work on restricting EBL to learn from only non-recursive explanations.
This  relationship  will  be  discussed  later  in  more  detail. 
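
The goal-nonrepetition test can be sketched as a walk up the goal hierarchy. The `Goal` class and the function name are hypothetical, and the sketch reads "equivalent literal" as "same predicate", which is what the (clear C) / (clear B) example suggests; a stricter reading is possible.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Goal:
    literal: Tuple[str, ...]          # e.g. ("clear", "C")
    parent: Optional["Goal"] = None   # link to the parent in the goal hierarchy

def repeats_ancestor(literal, parent):
    """True if a new subgoal with this literal would repeat an ancestor's predicate."""
    node = parent
    while node is not None:
        if node.literal[0] == literal[0]:
            return True
        node = node.parent
    return False

# G2 from Figure 2.5: (clear C). Making (MOVE B Table) applicable needs (clear B).
g2 = Goal(("clear", "C"))
repetition = repeats_ancestor(("clear", "B"), g2)          # repeated "clear" goal
no_repetition = repeats_ancestor(("on", "A", "Table"), g2)
```

A planner applying this bias would refuse to expand the goal hierarchy whenever the check returns true.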

2.4  Relationship  with  Search  Control 

As  described  before,  bias  affects  the  output  of  the  planning  process.  However,  if 
bias  can  be  incorporated  directly  into  the  planning  procedure,  then  it  can  also 
have  a  significant  impact  on  the  efficiency  of  the  planning  process  by  reducing  the 


number  of  candidate  plans  that  are  generated.  In  this  way,  bias  can  lead  to  effective 
control  of  search. 

1.  If  the  goal  of  the  problem  is  achieved,  show  the  output  plan  and  stop; 
else  continue 

2.  Select  a  goal  from  the  goal  hierarchy. 

3.  Select  an  operator  to  achieve  the  selected  goal. 

4.  If  the  selected  operator  is  applicable  to  the  current  state,  create  a 
new  state  by  applying  the  operator,  and  remove  achieved  goals  from 
the  goal  hierarchy.  Go  to  step  1. 

5.  If  the  selected  operator  is  not  applicable  to  the  current  state,  create 
subgoals  to  establish  the  unmet  preconditions  of  the  operator  and 
add  them  to  the  goal  hierarchy.  Go  to  step  1. 

Figure 2.6: A recursive planning procedure based on means-ends analysis.
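
The five steps of this procedure can be sketched in a few lines. This is a simplified illustration, not the implementation used in the thesis: operators are ground STRIPS-style triples, the goal hierarchy is flattened into a stack, goal and operator selection simply take the most recent goal and the first matching operator, and a step limit (an added assumption) guards against looping. The two ground operators are a hand-picked subset of a two-block world.

```python
from typing import FrozenSet, NamedTuple

class Op(NamedTuple):
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]

def mea_plan(state, goals, operators, max_steps=50):
    state = set(state)
    hierarchy = [g for g in goals if g not in state]   # goal hierarchy as a stack
    plan = []
    for _ in range(max_steps):
        # Step 1: stop when every top-level goal holds.
        if all(g in state for g in goals):
            return plan
        if not hierarchy:                               # a top-level goal was undone
            hierarchy = [g for g in goals if g not in state]
        goal = hierarchy[-1]                            # Step 2: select a goal
        candidates = [op for op in operators if goal in op.add]
        if not candidates:
            return None                                 # no operator achieves the goal
        op = candidates[0]                              # Step 3: select an operator
        if op.pre <= state:
            # Step 4: apply the operator and drop achieved goals.
            state = (state - op.delete) | op.add
            plan.append(op.name)
            hierarchy = [g for g in hierarchy if g not in state]
        else:
            # Step 5: subgoal on the unmet preconditions.
            hierarchy.extend(sorted(op.pre - state))
    return None

ops = [
    Op("move(A,B,Table)", frozenset({"on(A,B)", "clear(A)"}),
       frozenset({"on(A,Table)", "clear(B)"}), frozenset({"on(A,B)"})),
    Op("move(B,Table,A)", frozenset({"on(B,Table)", "clear(B)", "clear(A)"}),
       frozenset({"on(B,A)"}), frozenset({"on(B,Table)", "clear(A)"})),
]
start = {"on(A,B)", "on(B,Table)", "clear(A)"}
plan = mea_plan(start, ["on(B,A)"], ops)
```

In this run the operator for (on B A) is selected first, found inapplicable, and the subgoal clear(B) is added to the hierarchy before any operator is applied.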

For example, consider a recursive planning procedure based on means-ends
analysis, as shown in Figure 2.6.^ Table 2.1 shows the planning biases classified
according to the way they can be incorporated into this procedure. Linearity can
be incorporated into goal selection (step 2) by selecting a new goal conjunct only
after the current goal conjunct is achieved. Protection and plan-length can be incorporated
into operator selection (step 3) by rejecting operators which violate the
criterion for the bias. Goal-depth, goal-breadth, directness, and goal-nonrepetition
can be incorporated into goal expansion (step 5) by limiting the expansion of the
goal hierarchy.

However,  despite  this  close  relationship  between  search  control  and  bias,  there 
is  a  distinction  between  the  two.  Bias  determines  which  plan  is  generated  from 
the plan space, while search strategies determine the efficiency with which that
plan  is  found  from  the  search  space.  In  general,  the  search  space  is  not  necessarily 
equivalent  to  the  plan  space.  For  example,  a  node  in  the  search  space  for  a  MEA- 
based  planner  with  the  above  procedure  can  be  defined  as  a  combination  of  the 

^This algorithm is comparable to the one used in NOLIMIT [Veloso, 1989].
                                                      Justification
Class               Bias                Independence    Progress    Boundedness
------------------  ------------------  ------------    --------    -----------
Goal selection      Linearity                o
Operator selection  Protection               o              o
                    Plan-length                                          o
Goal expansion      Directness                              o            o
                    Goal-depth                                           o
                    Goal-breadth                                         o
                    Goal-nonrepetition                                   o

Table 2.1: Examples of planning biases and their justifications, classified according
to a MEA-based planning procedure.


current state and goal hierarchy, whereas a node in the plan space for this planner
can be defined as a partial sequence of operators. Whenever an operator is applied
to the current state, a node is expanded both in the search space and the plan space.
However, if the selected operator is not applicable to the current state, a node in
the plan space is not expanded, while a node in the search space is expanded for
the new goal hierarchy.


2.5  Summary 

In  this  chapter,  the  notion  of  bias  is  applied  to  planning.  An  analogy  between  the 
processes  of  concept  learning  and  planning  is  presented  in  terms  of  the  usage  of  bias 
in  the  process.  Since  bias  determines  which  portion  of  the  unbiased  space  can  be 
the  output  of  the  process,  using  an  appropriate  bias  is  critical;  if  it  is  too  weak  it  has 
no  effect,  but  if  it  is  too  strong  it  can  eliminate  the  desired  output.  Some  examples 
of  planning  biases  are  introduced  which  can  be  justified  by  independence,  progress, 
or  boundedness.  The  relationship  between  search  control  and  bias  is  discussed. 
Chapter 3


Single-Method  Planners 


As described in the introduction, the first hypothesis underlying this research is
that no single method will satisfy both sufficiency and efficiency for all situations.
The ideal way to evaluate the hypothesis would be to construct all possible single-method
planners and to evaluate their performance and scope for every possible
domain. However, this is not possible. In this research, a system is constructed
that can utilize a set of different planning methods, which vary in the amount of
bias used. These methods are implemented in the context of Soar, an architecture
which integrates basic capabilities for problem-solving, use of knowledge, learning,
and perceptual-motor behavior [Laird et al., 1987, Rosenbloom et al., 1991].

Soar has not traditionally been seen as a planning architecture, partly because
it does not create structures that resemble traditional plans, such as totally-ordered
plans or partially-ordered plans, and partly because its problem-solving approach
does not closely resemble the traditional planning methods [Rosenbloom et al.,
1993]. However, recent work on a Soar-based framework for planning has demonstrated
how versions of such standard planning methods as linear, nonlinear, and
abstraction planning can be derived from the Soar architecture [Rosenbloom et al.,
1990].

This  chapter  begins  with  an  overview  of  planning  in  Soar  and  introduces  a  set 
of  different  planning  methods  as  implemented  in  Soar  (version  6).  The  effect  of 
learning  in  these  methods  with  respect  to  the  performance  of  planning  is  discussed. 
Finally, these methods are evaluated experimentally in terms of planner completeness
(for sufficiency), planning time and plan length (for planning efficiency and
execution efficiency, respectively) in two domains.


3.1  Planning  in  Soar 

3.1.1  Overview  of  Soar 

Soar  is  based  on  the  hypothesis  that  all  symbolic  goal-oriented  behavior  may 
be represented in terms of problem spaces [Newell et al., 1991]. A problem space is
defined by a set of states and a set of operators. The states represent situations,
and the operators represent actions which apply to current states to yield new
states.  Problem-solving  in  Soar  is  driven  by  applying  operators  to  states  within  a 
problem  space  to  achieve  a  goal.  A  goal  context  consists  of  a  goal,  together  with 
the  current  problem  space,  state,  and  operator. 

Figure  3.1  illustrates  the  architectural  structure  of  Soar.  Knowledge  is  stored  in 
a  permanent  recognition  memory  and  a  temporary  working  memory.  Recognition 
memory  consists  of  a  set  of  variabilized  rules,  where  each  rule  is  a  condition-action 
pair. The conditions of each rule match against the content of working memory.
Conditions can contain variables, so that a single condition can match against
different data in working memory. If the conditions of a rule are matched, the
actions of the rule are instantiated to propose preferences that change the working
memory. The most typical preferences are feasibility (acceptable, reject) and
desirability  (best,  better,  indifferent,  worse,  worst)  preferences.  These  preferences 
are  held  in  preference  memory  and  used  by  a  decision  procedure  to  determine  what 
changes  are  made  to  working  memory.* 

If  the  system  does  not  have  sufficient  information  about  a  situation  to  make 
a  decision  for  that  situation,  then  an  impasse  arises.  For  example,  if  the  system 

*The current Soar version has a separate preference memory, which is not included in this
figure for simplicity.




Figure  3.1:  The  Soar  architecture. 
is unable to select the next operator from a set of candidate operators, then an
impasse, called a tie impasse (or selection impasse), arises. Other types of impasses
are generated if the system fails to generate a set of candidate operators (generation
impasse), or fails to execute the selected operator (execution impasse) [Rosenbloom
et al., 1990].
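
How preferences lead either to a decision or to an impasse can be illustrated with a heavily simplified sketch. It is not the actual Soar decision procedure: the function name, the (operator, kind) representation, and the treatment of `indifferent` as a unary preference are assumptions, and the binary `better`/`worse` preferences are left out.

```python
def decide(preferences):
    """preferences: list of (operator, kind) pairs. Returns (choice, impasse)."""
    by_kind = {}
    for op, kind in preferences:
        by_kind.setdefault(kind, set()).add(op)

    # Feasibility: acceptable candidates that have not been rejected.
    candidates = by_kind.get("acceptable", set()) - by_kind.get("reject", set())
    if not candidates:
        return None, "generation impasse"   # no candidate operators at all

    # Desirability: keep the best ones if any, then drop worst ones if others remain.
    best = candidates & by_kind.get("best", set())
    if best:
        candidates = best
    not_worst = candidates - by_kind.get("worst", set())
    if not_worst:
        candidates = not_worst

    if len(candidates) == 1:
        return next(iter(candidates)), None
    if candidates <= by_kind.get("indifferent", set()):
        return min(candidates), None        # any choice is fine; pick deterministically
    return None, "tie impasse"              # cannot choose among the candidates

tie = decide([("a", "acceptable"), ("b", "acceptable")])
resolved = decide([("a", "acceptable"), ("b", "acceptable"), ("b", "worst")])
```

In the first call nothing distinguishes the two candidates, so a tie impasse results; adding a single `worst` preference, such as one produced by a learned control rule, is enough to resolve it.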

In  response  to  an  impasse,  a  subgoal  is  automatically  generated.  Within  the 
subgoal, Soar searches for more information that can lead to the resolution of the
impasse.  As  the  result  of  the  subgoal,  new  preferences  are  generated  and  new  rules 
are  learned  (via  a  chunking  process)  whose  actions  are  based  on  the  preferences 
that  are  the  results  of  the  subgoal,  and  whose  conditions  are  based  on  the  working- 
memory  elements  in  supergoals  that  led  to  the  results.  In  effect,  chunking  is  much 
like explanation-based learning [Rosenbloom and Laird, 1986].

Note that the notions of subgoal and operator in Soar should be distinguished
from those in traditional planning. A Soar subgoal is generated in response to an
impasse whenever progress cannot be made on the current goal, and terminated
when the impasse is resolved. On the other hand, a planning subgoal is generated
in response to a precondition violation and terminated when the violated condition
is achieved. A precondition violation may or may not create an impasse in
Soar  depending  on  whether  or  not  knowledge  to  achieve  the  violated  condition  is 
available  in  the  current  goal  context. 

In  the  planning  framework  for  this  research,  planning  goals  (together  with 
subgoals)  and  their  hierarchy  are  explicitly  represented  as  augmentations  of  Soar 
states.  Precondition  violation  is  handled  in  a  single  goal  context  without  creating  a 
Soar  subgoal.  However,  if  there  is  no  information  about  how  to  apply  an  operator 
(yielding  an  execution  impasse)  or  how  to  select  among  the  candidate  operators 
(yielding a tie impasse), a Soar subgoal is created. A planning operator is represented
as a set of variabilized rules which create and apply an instantiated Soar
operator  to  change  the  current  Soar  state,  where  the  current  planning  state  and 
the  goal  hierarchy  are  represented. 
1.  If the goal of the problem is achieved, stop; else continue.

2.  Select  an  operator  to  achieve  one  of  the  active  goal  conjuncts  in  the 
goal  hierarchy. 

3.  If  the  selected  operator  is  applicable  to  the  current  state,  create  a 
new  state  by  applying  the  operator,  and  remove  achieved  goals  from 
the  goal  hierarchy.  Go  to  step  1. 

4.  If  the  selected  operator  is  not  applicable  to  the  current  state,  create 
subgoals  to  establish  the  unmet  preconditions  of  the  operator  and 
add  them  to  the  goal  hierarchy.  Go  to  step  2. 

Figure  3.2:  The  planning  algorithm  based  on  means-ends  analysis  as  implemented 
in  Soar. 


In  this  work,  the  predominant  planning  method  in  Soar  is  means-ends  analysis 
(MEA).  The  version  of  means-ends  analysis  implemented  in  this  work  is  close  to  the 
algorithm  described  in  Figure  2.6.  Figure  3.2  shows  the  skeleton  of  the  MEA-based 
planning  algorithm  implemented  in  Soar  for  the  framework  of  this  thesis.  There  are 
only  two  differences  between  this  algorithm  and  the  one  shown  in  Figure  2.6.  First, 
in this algorithm, a goal conjunct is selected implicitly from the goal hierarchy when
an  operator  is  selected  in  step  2.  By  merging  two  steps  (goal  selection  and  operator 
selection)  into  a  single  operator  selection  step,  the  number  of  decisions  required  to 
generate  a  plan  can  be  reduced.  Second,  there  is  no  explicit  output  plan  to  print 
in  step  1  in  this  algorithm.  This  is  because  a  plan  in  Soar  is  rarely  represented  as 
a  unitary  entity  like  a  totally-ordered  or  partially-ordered  plan.  Instead,  a  plan  in 
Soar is represented as a set of control rules or a set of preferences which jointly
specify  which  operators  should  be  executed  at  each  point  in  time. 

In  the  following  sections,  operator  representation,  plan  representation,  and 
planning in Soar are described in more detail.

3.1.2  Operator  Representation  in  Soar 

In the planning framework for this thesis, the implementations of planning operators
are represented by three classes of variabilized rules in recognition memory
    If the problem-space is blocks-world
     ∧ There exists an active goal (on <x> <y>)
     ∧ (on <x> <y>) is not achieved
    ⇒
    Propose an operator (MOVE <x> <y>) for (on <x> <y>)

(a) An operator proposal rule for (on <x> <y>).

    If the problem-space is blocks-world
     ∧ There exists an active goal (clear <x>)
     ∧ (clear <x>) is not achieved
     ∧ There exists a block <top> on top of <x>
     ∧ There exists an object <y> which is different from <x> and <top>
    ⇒
    Propose an operator (MOVE <top> <y>) for (clear <x>)

(b) An operator proposal rule for (clear <x>).

Figure 3.3: Examples of operator proposal rules.


(operator proposal rules, operator application rules, and goal expansion rules)
plus instantiated operator objects in working memory. An operator proposal
rule implements a bit of means-ends analysis, determining when it is appropriate
to propose operators. This rule is instantiated (possibly multiple times) based on
the current goal hierarchy represented in the working memory, creating a set of
instantiated Soar operators in the working memory.


Figure 3.3 shows examples of operator proposal rules in the blocks-world domain.
In our implementation of blocks-world, there is a single general operator,
MOVE, which moves a block from one location to another. However, depending on
the type of goal this operator is trying to achieve, different operator proposal rules
can be specified, as shown in Figure 3.3(a) and (b). A single operator proposal rule
can be instantiated with different components of the state, yielding multiple instantiated
operators. In Figure 3.3(a), for example, if the goal of a problem is to stack
a set of n blocks, represented by (and (on $Block_1$ $Block_2$) (on $Block_2$ $Block_3$)
    If the problem-space is blocks-world
     ∧ The operator is (MOVE <x> <z>) for goal <w>
     ∧ <x> is on <y>
     ∧ <x> and <z> are clear
     ∧ <z> is the table
    ⇒
    <x> is not on <y>
     ∧ <x> is on <z>
     ∧ <y> is clear
     ∧ <w> is achieved

(a) An operator application rule to put down a block onto the table.

    If the problem-space is blocks-world
     ∧ The operator is (MOVE <x> <z>) for goal <w>
     ∧ <x> is on <y>
     ∧ <x> and <z> are clear
     ∧ <z> is a block
    ⇒
    <x> is not on <y>
     ∧ <x> is on <z>
     ∧ <y> is clear
     ∧ <z> is not clear
     ∧ <w> is achieved

(b) An operator application rule to stack a block onto another block.

Figure 3.4: Examples of operator application rules.


... (on $Block_{n-1}$ $Block_n$)), then (on <x> <y>) is instantiated with each of
the $n - 1$ goal conjuncts when the problem solving starts.

Once an operator has been selected for the current state by the decision procedure,
it can be applied to generate a new state if its preconditions are met.
Figure  3.4  shows  examples  of  operator  application  rules  for  the  MOVE  operator.  In 
this  implementation  of  blocks-world,  two  operator  application  rules  are  used  for 
this  operator:  one  to  put  down  a  block  onto  the  table,  and  one  to  stack  a  block 
onto  another  block.  Once  an  operator  has  been  applied,  operator  proposal  rules 
    If the problem-space is blocks-world
     ∧ The operator is (MOVE <x> <z>) for goal <w>
     ∧ <x> is not clear
    ⇒
    Create a new goal <new> to clear <x>
     ∧ The parent of <new> is <w> in the goal hierarchy

(a) A goal expansion rule in which the block to be moved is not clear.

    If the problem-space is blocks-world
     ∧ The operator is (MOVE <x> <z>) for goal <w>
     ∧ <z> is not clear
    ⇒
    Create a new goal <new> to clear <z>
     ∧ The parent of <new> is <w> in the goal hierarchy

(b) A goal expansion rule in which the destination block is not clear.

Figure 3.5: Examples of goal expansion rules.

are matched against the new state and the updated goal hierarchy, generating the
next  set  of  candidate  operators. 

If the selected operator is not applicable, goal expansion rules (as shown in Figure 3.5)
are instantiated to generate a new goal hierarchy. In effect, goal expansion
rules  implement  operator  subgoaling  in  means-ends  analysis. 
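
The three rule classes described in this section can be mirrored as three small functions. This is an illustrative sketch only: Soar rules are matched against working memory rather than called as functions, and the state encoding (a dictionary from each block to what it rests on), the tuple form of operators, and all function names are assumptions.

```python
TABLE = "Table"

def is_clear(on, x):
    """The table is always clear; a block is clear if nothing rests on it."""
    return x == TABLE or x not in on.values()

def holds(on, goal):
    if goal[0] == "on":
        return on.get(goal[1]) == goal[2]
    return is_clear(on, goal[1])            # ("clear", x)

def propose(on, goals):
    """Operator proposal (cf. Figure 3.3): MOVE operators for unachieved goals."""
    ops = []
    for goal in goals:
        if holds(on, goal):
            continue
        if goal[0] == "on":
            ops.append(("MOVE", goal[1], goal[2], goal))
        else:
            x = goal[1]
            top = next(b for b, s in on.items() if s == x)   # block on top of x
            for y in [TABLE] + sorted(on):
                if y not in (x, top):
                    ops.append(("MOVE", top, y, goal))
    return ops

def applicable(on, op):
    _, x, z, _ = op
    return is_clear(on, x) and is_clear(on, z)

def apply_op(on, op):
    """Operator application (cf. Figure 3.4): returns the new state."""
    _, x, z, _ = op
    new_on = dict(on)
    new_on[x] = z
    return new_on

def expand(on, op):
    """Goal expansion (cf. Figure 3.5): (subgoal, parent goal) pairs."""
    _, x, z, goal = op
    return [(("clear", b), goal) for b in (x, z) if not is_clear(on, b)]

# One cycle on a hypothetical state: A on B on C, with C and D on the table.
on = {"A": "B", "B": "C", "C": TABLE, "D": TABLE}
goal = ("on", "B", "D")
op = propose(on, [goal])[0]                 # (MOVE B D), not yet applicable
subgoals = expand(on, op)                   # [(("clear", "B"), goal)]
sub_op = propose(on, [subgoals[0][0]])[0]   # (MOVE A Table) for (clear B)
on_after = apply_op(apply_op(on, sub_op), op)
```

As in Figure 3.3(b), the proposal for (clear B) yields one instantiated operator per possible destination, and selecting among them is left to the decision procedure.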

3.1.3  Plan  Representation  in  Soar 

As described in Chapter 2, a plan for a problem can be defined as a structure
that represents the sequence of actions to be taken for that problem [Rosenbloom
et al., 1993]. With this definition of a plan in hand, two predominant structures
can be identified that serve as plans in Soar. The first structure is the set of variabilized
control rules in recognition memory that serves as generalized plans for
classes of potential goals. Control rules are different from operator representation
rules described in the previous section in that control rules generate instantiated


If  the  problem-space  is  bhcks-toorid  < 

A  Goal  protection  is  assumed  to  hold 
A  There  exists  an  active  goal  (on  <x>  <y>) 

A  There  exists  an  active  goal  (on  <y>  <z>) 

A  (on  <x>  <y>)  and  (on  <y>  <z>)  are  not  achieved 

A  <x>  and  <y>  are  blocks  ( 

A  <x>  and  <y>  are  clear 
A  The  proposed  operator  is  (MOVE  <x>  <y>) 

=► 

The  operator  is  worst 


Figure  3.6:  A  generalized  plw. 


preferences  to  help  select  the  current  operator  from  the  csmdidate  operators,  thus 
yielding  indirectly  a  sequence  of  operators.  The  second  structure  is  the  set  of  m- 
stantiated  preferences  in  preference  memory  that  serves  as  instantiated  plans  for 
active  goals.  The  instantiated  preferences  can  be  generated  either  by  the  general¬ 
ized  plans  or  as  the  results  of  subgoals  (that  is,  by  planning). 

Figure 3.6 shows an example of a generalized plan for a set of problems shown
in Figure 3.7(a). This rule implies that if goal protection is assumed to hold, one
wants a stack of at least three blocks, neither of the top two blocks (out of the
three) is in position, both of the top two blocks (out of the three) are clear, and
an operator is proposed to put the top one on the second one, then that operator is
worst. Figure 3.7(b) shows the sequence of steps to generate a sequence of operators
for a four-block-stacking problem. For each step, it shows the current state, the
goal conjuncts that have not yet been achieved, the operators proposed, and the
portion of the instantiated plan generated from the generalized plan in Figure 3.6.
Figure 3.7(c) then shows the actual operator sequence this plan generates.
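
As a rough illustration of how the rule in Figure 3.6 filters candidate operators, the
sketch below encodes its conditions as a Python predicate. The representation (a state
as a set of ("on", top, bottom) tuples) and the function names are assumptions of the
example, not part of Soar.

```python
def is_clear(block, state):
    """A block is clear when no (on top block) fact mentions it as the support."""
    return all(bottom != block for (_, _, bottom) in state)


def worst_by_figure_3_6(operator, goals, state, goal_protection=True):
    """Return True if the rule of Figure 3.6 marks (MOVE x y) as worst."""
    x, y = operator
    if not goal_protection or y == "Table":    # <x> and <y> must both be blocks
        return False
    unachieved = [g for g in goals if g not in state]
    wants_x_on_y = ("on", x, y) in unachieved
    y_not_yet_placed = any(top == y for (_, top, _) in unachieved)  # some (on y z)
    return (wants_x_on_y and y_not_yet_placed
            and is_clear(x, state) and is_clear(y, state))


# Four-block stacking: every block starts on the table.
state = {("on", b, "Table") for b in "ABCD"}
goals = [("on", "A", "B"), ("on", "B", "C"), ("on", "C", "D")]
proposed = [("A", "B"), ("B", "C"), ("C", "D")]
print([op for op in proposed if not worst_by_figure_3_6(op, goals, state)])
```

In the initial state the predicate marks (MOVE A B) and (MOVE B C) as worst, leaving
(MOVE C D) as the only operator that is not avoided; this is the bottom-up stacking
order the rule is meant to enforce.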

The plan representation in Soar has many interesting aspects. First, the preference
language has an imperative construct (best) that allows relatively direct
specification of the next action to perform; however, it also goes beyond this. For
example, partial orders can be specified by using binary preferences such as worse
and better, and operator avoidance can be specified by using worst and reject
preferences. Second, the use of control rules provides a fine-grained conditionality
and context sensitivity that allows it to easily encode such control structures as
conditionals and loops. In addition, the variabilization of the control rules allows
a single plan fragment to be instantiated for multiple related decisions.

[Figure 3.7 panels. (a) Goals of three problems: G1: (on A B), G2: (on B C); the same
with G3: (on C D) added; the same with G4: (on D E) added. (b) Rows for the unachieved
goal conjuncts, the proposed operators, and the instantiated plan, in which (MOVE A B)
and (MOVE B C) are marked worst. (c) The operator sequence.]

Figure 3.7: The plan representation in Soar: (a) a set of problems which are solvable
by the rule in Figure 3.6, (b) the sequence of steps for a four-block-stacking problem,
(c) the sequence of operators.

3.1.4  Planning  Methods  in  Soar 

Although the plan representation in Soar is different from conventional plan
representations, recent work on a Soar-based framework of planning has demonstrated
how versions of such standard planning methods as linear, nonlinear, and abstraction
planning can be derived by adding method increments that include core means-ends
knowledge about what operators to suggest for consideration, and varying
knowledge about how to respond to impasses resulting from precondition failures
[Rosenbloom et al., 1990].

Figure 3.8 illustrates initial traces of particular versions of these three forms
of planning as implemented in Soar for Sussman’s anomaly (Figure 1.1) in the
blocks-world.* They all start with a top-level operator that is to achieve the entire
conjunctive goal, (and (on B C) (on A B)), directly from the initial state,
and reach an execution impasse if there is no information about how to do this.
In response to this impasse, a subgoal is created where means-ends analysis is
used to generate the set of candidate operators, (MOVE B C) and (MOVE A B),
that are known to potentially be able to achieve any of the goal conjuncts. A
tie impasse then occurs unless there is information about how to pick among them
(or unless only one operator is generated). In this tie impasse, a look-ahead search
begins by selecting one of the alternatives to evaluate; here it is (MOVE A B).
Its preconditions are tested and if the operator is known to be applicable, it is
executed to create a new state. If it is not known to be applicable, as here,
what happens next depends on the planning method.

*Abstraction in the blocks world is shown for comparison purposes. Although abstraction
has not actually been implemented within the planning framework for this research, it
has been implemented in Soar by Unruh [1993].

Figure 3.8: Planning in the blocks-world using (a) linear; (b) nonlinear; and (c)
abstraction planning.

With abstraction, the operator is executed anyway and problem solving just
continues. In Figure 3.8(c), for example, operator (MOVE A B) is executed even
though block A is not clear. Without abstraction, as in Figure 3.8(a) and (b), a
new set of goal conjuncts is generated from the operator’s unmet preconditions.

The difference between linear and nonlinear planning, at least for these versions,
is in the focus of operator generation from the new goal hierarchy. Linear planning
shifts focus completely to the new conjunct, (clear A), as in Figure 3.8(a). It
stays with the new conjunct until it is achieved, and then pops back to the original
conjunct that led to the precondition violation. Processing shifts to one of its
siblings (if there are any) only after the original conjunct is achieved.

Nonlinear planning instead shifts focus to an expanded set of conjuncts that
includes the new set plus the original set minus the conjunct that led to the
precondition violation, yielding (on B C) and (clear A) here (Figure 3.8(b)). At any
point in time an operator can be selected for any of these conjuncts, enabling
operator sequences to be interleaved as necessary (similar to the casual-commitment
approach to nonlinear planning [Veloso, 1989]). For both planning methods, once
the new focus has been determined, planning continues recursively by using means-ends
analysis to generate candidate operators from the new goal hierarchy.
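
The difference in focus between the two methods can be made concrete with a small
sketch. The function `shift_focus` and the tuple encoding of conjuncts are
hypothetical; the sketch only reproduces the set arithmetic described above and not
the surrounding search.

```python
def shift_focus(conjuncts, failed, new_conjuncts, method):
    """Return the conjuncts attended to after `failed` hits a precondition violation."""
    if method == "linear":
        # Attend only to the new conjuncts until they are achieved.
        return list(new_conjuncts)
    if method == "nonlinear":
        # Original set minus the failed conjunct, plus the new set.
        return [c for c in conjuncts if c != failed] + list(new_conjuncts)
    raise ValueError(f"unknown method: {method}")


# Sussman's anomaly: (MOVE A B) fails because A is not clear.
goal = [("on", "B", "C"), ("on", "A", "B")]
failed = ("on", "A", "B")
new = [("clear", "A")]
print(shift_focus(goal, failed, new, "linear"))     # only (clear A)
print(shift_focus(goal, failed, new, "nonlinear"))  # (on B C) and (clear A)
```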

So far, we have been referring to these methods as “planning methods”, because
they are versions of classical methods used in the creation of plans. With
this notion in hand, the question to be asked then is how they actually yield plans.

As mentioned earlier, a plan in Soar consists of a set of plan fragments; that is,
a set of either instantiated preferences or generalized control rules. Instantiated
plan fragments are generated whenever operator preferences are created in working
memory. This can happen simply by the instantiation of a generalized plan
fragment (by the execution of a control rule) or as a result of projection in an
operator-selection subgoal.

In projection, one or more operators are tried out in look-ahead search to see
which ones lead to success or failure. Success engenders best preferences and failure
engenders worst preferences. For example, in Figure 3.8(a) a best preference is
returned from the selection subgoal if the result of evaluating (MOVE A B) is success,
whereas a worst preference is returned if the result is failure. These preferences act
directly as fragments of a plan for the currently active goals. In addition, whenever
a preference is returned as a result of a subgoal, it triggers Soar’s chunking process,
which creates and stores a control rule that acts as a generalized plan fragment
for classes of problems. These relationships are summarized by the following two
influence paths.


Planning method => Projection => Instantiated plan
Planning method => Projection => Learning => Generalized plan


While projection plays an integral role in determining which plans are created,
what is projected and what is considered to be success or failure are determined
by the planning method. Within this framework, planning biases are implemented
by altering the planning method, which then determines which plans are created,
through the influence paths above. For example, a protection bias is implemented
by altering the planning method to terminate look-ahead with failure any time a
projected path leads to a protection violation. In comparison to the same planner
without this bias, the protection planner will lead to the creation of worst
preferences (and negative control rules) which will avoid paths that violate protection.
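
A minimal sketch of this idea follows, assuming a projected path is given as a list of
successive states (sets of facts). The function `project` and the choice to treat a
path that ends without all goals achieved as a failure are assumptions of the example;
the actual Soar mechanism operates through subgoals and preferences rather than a
single function call.

```python
def project(states, goals, protection=False):
    """Evaluate one projected path and return the preference it engenders."""
    achieved = set()
    for state in states:
        # Protection bias: cut the look-ahead off with failure as soon as a
        # previously achieved goal conjunct is undone.
        if protection and any(g not in state for g in achieved):
            return "worst"
        achieved |= {g for g in goals if g in state}
    success = all(g in states[-1] for g in goals)
    return "best" if success else "worst"


on = lambda top, bottom: ("on", top, bottom)
goals = [on("A", "B"), on("B", "C")]
# Sussman's anomaly, stacking B on C first and then undoing it to free A.
path = [
    {on("C", "A"), on("A", "Table"), on("B", "Table")},
    {on("C", "A"), on("A", "Table"), on("B", "C")},      # (on B C) achieved
    {on("C", "A"), on("A", "Table"), on("B", "Table")},  # (on B C) undone
    {on("C", "Table"), on("A", "Table"), on("B", "Table")},
    {on("C", "Table"), on("A", "Table"), on("B", "C")},
    {on("C", "Table"), on("A", "B"), on("B", "C")},
]
print(project(path, goals, protection=False))
print(project(path, goals, protection=True))
```

The same path is evaluated as a success without the bias and as a failure with it,
which is how the altered planning method changes the preferences that are produced.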


3.2  Implemented  Planning  Biases 

Within the context of Soar, an integrated planning system has been constructed
which utilizes a set of different methods. These methods vary in the amount of
bias used. The planning biases that have been concentrated on in this research are
directness, linearity and protection. Linearity and protection are chosen because
they have been widely used in the planning literature, and directness is chosen
because it can generate an efficient plan very quickly for a number of problems.
These three biases are defined along two bias dimensions: goal flexibility and
goal protection.

[Figure 3.9 panels, each showing a goal hierarchy with active and pending goals:
(a) No Subgoal (Directness); (b) Local Focus (Linear); (c) Global Focus (Nonlinear).]

Figure 3.9: Goal-flexibility dimension.

The  goal-flexibility  dimension  is  shown  in  Figure  3.9.  It  determines  the  degree  of 
flexibility  the  planner  has  in  generating  new  subgoals  and  in  shifting  the  focus  in  the 
goal  hierarchy.  This  dimension  subsumes  the  directness  and  linearity  biases.  The 
most  restricted  point  along  this  dimension  allows  no  generation  of  new  subgoals 
for  precondition  violations  (Figure  3.9(a)),  yielding  a  single-level  goal  hierarchy. 
This  implements  a  directness  bias  by  ensuring  that  each  of  the  operators  in  a  plan 
directly  achieves  an  initial  goal  conjunct,  rather  than  an  unmet  precondition  of 
another  operator. 

The second point along the flexibility dimension allows generation of new subgoals,
but only a single local set of conjuncts is attended to at any point in time
(Figure 3.9(b)). This local focus of attention has two main consequences for the
planner. First, it reduces the branching factor of the planner’s search, with
respect to the nonlinear planner, by restricting the set of operators that the
planner can consider at any point in time to just those that are able to achieve the
local conjuncts. Second, with the assumption that an operator achieves only one
goal conjunct and that the placement of operators in the plan is restricted within
the context of the local conjuncts from which it arose, it enforces linearity on the
resulting plans (thus implementing linear planning) by ensuring that operators for
different goal conjuncts cannot be interleaved in the output plans.

The third point along the flexibility dimension allows the global use of subgoals;
that is, new goal conjuncts are generated for unmet preconditions, and operators
are simultaneously considered for all unsatisfied conjuncts (Figure 3.9(c)). This is
the least restricted version, and implements nonlinear planning by allowing operators
for different goal conjuncts to be interleaved, as in NOLIMIT [Veloso, 1989].

The goal-protection (GP) dimension is shown in Figure 3.10. The two points
implemented along this dimension correspond to goal protection (Figure 3.10(a)),
in which every achieved top-level goal conjunct is protected between the time
it is achieved and the time it is no longer needed, and to no goal protection
(Figure 3.10(b)). The main consequence of using goal protection is that it shrinks
the search space by cutting off sequences of operators which violate goal protection.

Figure 3.11 characterizes the 3x2 set of planning methods derived from these
bias dimensions. Each of the cells in Figure 3.11 shows a label representing the
planner for that cell along with a problem that is just hard enough to require that
planner; that is, the problem can be solved optimally by the planner represented
by that cell, but not by either the planner to its left or the planner above it. The
bottom-left cell represents an extended blocks-world problem where a block that is
second from the top of a tower can be moved [Etzioni, 1990a]. The most restricted
planner (M1), a direct goal-protection planner, is in the top-left cell of the
figure. While quite restrictive, it is sufficient to solve the block-stacking problem
shown in that cell of the figure. The least restricted planner (M6), a nonlinear
planner without goal protection, is in the bottom-right cell of the figure. It is the
only planner in the figure capable of generating an optimal solution to the
blocks-world problem shown in that cell. Between these two extremes, moving up or to
the left yields more bias, while moving down or to the right yields less bias. In each
of these intermediate cells, the problem shown is likewise one that is just hard
enough to require that planner.
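
The 3x2 arrangement can be written down as a small lookup table. The sketch below
assumes the row-major numbering suggested by the labels used in this chapter (M1 to M3
with goal protection, M4 to M6 without); the names `METHODS`, `describe`, and
`at_least_as_biased` are illustrative only.

```python
from itertools import product

FLEXIBILITY = ("directness", "linear", "nonlinear")   # columns, left to right
PROTECTION = ("goal protection", "no goal protection")  # rows, top to bottom

# M1..M3 fill the top row and M4..M6 the bottom row.
METHODS = {
    f"M{row * 3 + col + 1}": (row, col)
    for row, col in product(range(2), range(3))
}


def describe(name):
    """Return the two bias settings of a method as a readable string."""
    row, col = METHODS[name]
    return f"{FLEXIBILITY[col]}, {PROTECTION[row]}"


def at_least_as_biased(a, b):
    """True if method a lies above and/or to the left of b (or is b itself)."""
    (row_a, col_a), (row_b, col_b) = METHODS[a], METHODS[b]
    return row_a <= row_b and col_a <= col_b


print(describe("M1"), "|", describe("M6"))
print(at_least_as_biased("M1", "M6"), at_least_as_biased("M2", "M4"))
```

Note that the ordering is partial: M2 and M4, for instance, are not comparable, since
one is more restricted in flexibility and the other in protection.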

Note that in the blocks-world domain, both M5 and M6 are complete planners
in that they can potentially solve every problem, though M5 may not be able
to generate an optimal solution for some problems. However, in domains with
irreversible operators as shown in Figure 1.2, M6 is the only complete planner.

Figure 3.12 compares the traces of these methods for Sussman’s anomaly. They
all start with a combination of the initial state, the entire conjunctive goal,
(and (on B C) (on A B)), and the initial set of candidate operators, (MOVE B C)
and (MOVE A B), which are generated by means-ends analysis. If there is
no information about which operator to select, a tie impasse occurs.* In this tie
impasse, a look-ahead search begins by selecting one of the alternatives to evaluate;
here it is (MOVE A B).

*For simplicity of presentation, these traces only show tie impasses.

Figure 3.11: The planning methods generated by the bias dimensions.

[Figure 3.12 panels: (a) The initial state, goals, and operators; (b) Directness &
protection (M1); (c) Linear & protection (M2); (d) Nonlinear & protection (M3);
(e) Linear & no protection (M5). One trace is annotated "Protection failure".]

Figure 3.12: Planning in Soar.


If the directness bias is used, as in Figure 3.12(b), the evaluation of (MOVE
A B) is terminated immediately with failure as the evaluation value, and the other
operator (MOVE B C) is selected. If the directness bias is not used, a new set of goal
conjuncts is generated from the operator’s unmet preconditions (Figure 3.12(c-e)).


Linear planning focuses on the new conjunct, (clear A) as in Figure 3.12(c)
and (e), until it is achieved, and then returns to the original conjunct that
led to the impasse; here, (on A B). Sibling conjuncts (here, (on B C))
are considered only after the original conjunct is achieved. In this problem, this
eventually leads to failure if a protection bias is used (Figure 3.12(c)), or generates
a non-optimal plan if a protection bias is not used (Figure 3.12(e)). Nonlinear
planning instead shifts focus to the entire set of goal conjuncts (except the one
that led to the impasse), (and (on B C) (clear A)), as in Figure 3.12(d).
This eventually can yield an optimal plan for this problem regardless of the use of
a protection bias.


3.3  Learning  in  Single-Method  Planners

For each of the single-method planners, chunking is performed over the planner’s
projection (look-ahead) process: the elements to be explained are the preferences
generated during projection, and the explanations are the traces of the projections
that led to the preferences. Both positive rules and negative rules can be learned
from projections. Figures 3.13 and 3.15 provide a simple example of this.

Figure 3.13 shows a path projected by the nonlinear planner for a simple
four-block-unstacking problem. This projection proceeds through multiple tie impasses
until the problem is successfully solved. In this example, (MOVE A Table) is
evaluated in the first operator-selection subgoal, and (MOVE B Table) is evaluated in
the second operator-selection subgoal. As shown in Figure 3.14, this results in a
pair of positive control rules, one for each correct decision on the solution path.
Evaluating an operator which is not directly applicable in the current state (here
(MOVE B Table) or (MOVE C Table) in the first operator-selection subgoal) also
leads to success in nonlinear planning, though the learned rules are more complex.

Figure 3.15 shows a path projected with a directness bias for the same
block-unstacking problem. In contrast to the previous case, the projection is terminated
with failure as soon as the non-applicable operator (MOVE B Table) is selected. As
shown in Figure 3.16, this yields a negative control rule for the incorrect decision
on the solution path.

Chunk-2: If the problem-space is blocks-world
∧ There exists an active goal (on <y> Table)
∧ There exists an active goal (on <z> Table)
∧ (on <y> Table) and (on <z> Table) are not achieved
∧ <y> is on <z>
∧ <y> is clear
∧ The proposed operator is (MOVE <y> Table)
=>
The operator is best

(a)

Chunk-4: If the problem-space is blocks-world
∧ There exists an active goal (on <x> Table)
∧ There exists an active goal (on <y> Table)
∧ There exists an active goal (on <z> Table)
∧ (on <x> Table), (on <y> Table), and (on <z> Table) are not achieved
∧ <x> is on <y>
∧ <y> is on <z>
∧ <x> is clear
∧ The proposed operator is (MOVE <x> Table)
=>
The operator is best

(b)

Figure 3.14: Learned rules for block unstacking with nonlinear planning.


Note that if the planner’s bias is reflected in an altered planning method, which
in turn yields an altered projector, then the planner’s bias can indirectly induce
a bias in the resulting learning process. For example, the rules in Figure 3.14 are
relatively specialized, because each must encapsulate the entire explanation for
why a particular operator will eventually lead to success. In larger problems these
explanations get even larger, and the rules end up being even more specialized.

On the other hand, the explanation for the rule in Figure 3.16 is quite short,
based as it is on the explicit assumption that directness can hold and on the failure
of the first selected operator to be applicable. As it turns out, this single rule is
general enough to handle the entire problem, by removing from consideration all
operators that attempt to move unclear blocks onto the table. The bias in this case
has thus yielded faster planning and learning (because of shorter projections and
explanations) and has resulted in the acquisition of fewer, more general rules.

Figure 3.15: Four block unstacking with directness.

Chunk-6: If the problem-space is blocks-world
∧ Directness is assumed to hold
∧ There exists an active goal (on <y> Table)
∧ (on <y> Table) is not achieved
∧ <y> is not clear
∧ The proposed operator is (MOVE <y> Table)
=>
The operator is worst

Figure 3.16: A learned rule for block unstacking with directness.
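
To see why the single rule of Figure 3.16 suffices for towers of any height, the
following sketch applies it repeatedly to a single tower. The state encoding (a
dictionary from each block to its support) and the helper names are assumptions of the
example; only the rule's conditions are taken from the figure.

```python
def rejected_by_chunk6(block, on):
    """Chunk-6: (MOVE block Table) is worst if its goal is unmet and block is not clear."""
    goal_unachieved = on[block] != "Table"
    not_clear = block in on.values()      # some other block rests on it
    return goal_unachieved and not_clear


def unstack(n):
    """Unstack a tower of n blocks (B1 on top) using only the Chunk-6 filter."""
    tower = [f"B{i}" for i in range(1, n + 1)]
    on = {upper: lower for upper, lower in zip(tower, tower[1:])}
    on[tower[-1]] = "Table"
    plan = []
    while any(below != "Table" for below in on.values()):
        proposed = [b for b in tower if on[b] != "Table"]   # one MOVE per unmet goal
        kept = [b for b in proposed if not rejected_by_chunk6(b, on)]
        block = kept[0]                   # in a single tower exactly one survives
        on[block] = "Table"
        plan.append(("MOVE", block, "Table"))
    return plan


print(unstack(4))
print(len(unstack(10)))
```

The same predicate, with no reference to the number of blocks, leaves exactly one
candidate at every decision, whether the tower has four blocks or ten.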

Implicit in this example is one approach to producing generalization to N
[Bostrom, H., 1990, Cohen, 1988, Shavlik, 1989, Subramanian and Feldman, 1990], where
a plan learned for a problem of a particular size can transfer to solve problems
with the same structure but of arbitrary size [Rosenbloom et al., 1993]. Without
directness, the control rules are specific to particular numbers of blocks, and thus
can only be used to directly solve terminal subregions of larger problems. However,
with directness, a single rule is learned that removes from consideration at each
decision all operators that move unclear blocks to the table, no matter how many
unclear blocks there are. This idea can be applied to other problems and biases
as well. Figure 3.17, for example, shows a path projected with protection for a
four-block-stacking problem. As with the directness bias in block unstacking, a
protection bias leads here to learning a single negative rule (Figure 3.18) that can
be applied to stacking problems of arbitrary size.

A third type of bias that can also induce generalization to N is complete
protection. Complete protection is a variant on goal protection that provides a very
strong bias by not only protecting established goals, but also protecting established
operator sequences. That is, it disallows any backtracking on operator selection,
thus letting projection be terminated with success whenever an operator is
selected, rather than waiting until the entire problem has been solved. As with the
directness example, projection is terminated here after the first operator is selected
(Figure 3.19(a)). However, in this case it is terminated with success as soon as the
top block is moved to the table. The explanation for this success depends only on
the explicit assumption of complete protection and on the fact that the operator
was successfully applied, so a relatively general, positive control rule is learned
(Figure 3.20). Although this is a positive rule, it also turns out to produce
generalization to N, but now by always specifying that the one clear block that is not
already on the table should be moved to the table (if it were already on the table,
there would be no active goal conjunct for it). The resulting rule can transfer to
any number of iterations, as shown in Figure 3.19(b).

Figure 3.17: Four block stacking with protection.

Chunk-8: If the problem-space is blocks-world
∧ Goal protection is assumed to hold
∧ There exists an active goal (on <x> <y>)
∧ There exists an active goal (on <y> <z>)
∧ (on <x> <y>) and (on <y> <z>) are not achieved
∧ <x> and <y> are blocks
∧ <x> and <y> are clear
∧ The proposed operator is (MOVE <x> <y>)
=>
The operator is worst

Figure 3.18: A learned rule for four block stacking with protection.

Figure 3.19: Three and five block unstacking with complete protection: (a) a
projected path, (b) transfer of the learned rule to a different number.

Chunk-10: If the problem-space is blocks-world
∧ Complete protection holds
∧ There exists an active goal (on <x> Table)
∧ (on <x> Table) is not achieved
∧ <x> is clear
∧ The proposed operator is (MOVE <x> Table)
=>
The operator is best

Figure 3.20: A learned rule for four block unstacking with complete protection.

The key to producing generalization to N with these biases is that they enable
learning from non-iterative paths; in this way it is similar to Etzioni’s [1990a]
work on restricting EBL to learn from only non-recursive paths. In the directness
and protection cases, the success paths are iterative, but (negative) rules can
instead be learned from non-iterative failure paths. In the complete-protection case,
learning occurs from a fragment of the success path that corresponds to just a
single cycle of iteration. In both cases, the resulting rules can transfer to any number
of iterations.


3.4  Experimental  Results 

Experimental results from the six planners in two planning domains, the blocks-world
domain and the machine-shop scheduling domain, are shown in Tables 3.1-3.3.
The data come from running each planner on the same set of 100 problems for
each domain. For each problem in the blocks-world domain, the number of blocks
was randomly selected between three and four. Given the number of blocks, an
initial state was randomly generated among the possible configurations of the blocks
and the table (3 configurations for 3 blocks, and 5 configurations for 4 blocks).
The generated initial state was represented as a set of (on x_i y_i)-type predicates.
Likewise a set of (on x_j y_j)-type goal conjuncts was randomly generated that
numbered between two and the number of blocks in the initial state. For each goal
conjunct, x_j was selected randomly from the initial set of blocks, and then y_j was
selected randomly from among the table and the blocks which have not yet been
selected as y_k (k < j). The number of possible combinations of goal conjuncts for
n-block problems is O(n^n), because for each of the n blocks, there are n possible
locations.
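
The n^n bound can be checked by brute-force enumeration for the two problem sizes used
here. The sketch assumes that the n possible locations of a block are the table plus
the n - 1 other blocks, and it counts every assignment, including ones that could not
be physically realized; the function name is illustrative.

```python
from itertools import product


def goal_assignments(blocks):
    """Yield every way of assigning a location to each block (an upper bound)."""
    locations = list(blocks) + ["Table"]
    # Each block may sit on the table or on any other block: n choices per block.
    choices = [[loc for loc in locations if loc != b] for b in blocks]
    for combo in product(*choices):
        yield [("on", b, loc) for b, loc in zip(blocks, combo)]


for n in (3, 4):
    blocks = [chr(ord("A") + i) for i in range(n)]
    count = sum(1 for _ in goal_assignments(blocks))
    print(n, count, n ** n)
```

For three blocks the enumeration gives 27 assignments and for four blocks 256, matching
n^n in both cases.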

A task in the machine-shop scheduling domain is to determine a sequence of
machining operations to produce the desired objects so as to meet the given
requirements [Minton, 1988].* The shop contains several machines, including ROLL,
LATHE, PUNCH, DRILL-PRESS, POLISH, GRINDER, SPRAY-PAINT, and IMMERSION-PAINT.
Each object has five attributes: shape, has-hole, surface-condition,
painted, and temperature. Each attribute can have one of two to four types of
values. For each problem, the initial state was generated by assigning a randomly
generated type to each attribute for an object (except that the initial temperature
is always cold). The number of goal conjuncts for each problem was fixed as
five. The goal conjuncts for each problem were generated randomly as with the
initial-state generation.

Learning  was  turned  on  for  each  problem,  but  only  within-trial  transfer  was 
allowed;  that  is,  rules  learned  during  one  problem  were  not  used  for  other  problems. 
Planning  time  is  mainly  measured  in  terms  of  decisions,  the  basic  behavioral  cycle 
in  Soar.  This  measure  is  not  quite  identical  to  the  more  traditional  measure  of 
number  of  planning  operators  executed,  but  should  still  correlate  with  it  relatively 
closely. 


*The version of the machine-shop domain used in this research is almost identical to the
original PRODIGY version presented in [Minton, 1988]. The only difference between the
two versions is that the time augmentation for each generated operation in the original
version is not specified in our version, because our main focus here is on the sequence
of operations rather than the time when to execute the operations.

          No subgoal      Local           Global
          (Directness)    (Linear)        (Nonlinear)

[The cell values of panel (a) are missing from the source.]

(a) Blocks world domain.

          No subgoal      Local           Global
          (Directness)    (Linear)        (Nonlinear)

          M1              M2              M3
GP        70 (A1)         70 (A2)         70 (A3)

          M4              M5              M6
No GP     100 (A4)        100 (A5)        100 (A6)

(b) Machine-shop scheduling domain.

Table 3.1: Number of problems solved.


                      No subgoal      Local           Global
                      (Directness)    (Linear)        (Nonlinear)

                      M1              M2              M3
GP      A1 (A4)       8.63 (0.17)     15.06 (0.41)    16.24 (0.43)
        A2            -               22.67 (0.92)    23.66 (0.73)
        A3            -               -               23.60 (0.73)
        A5 (A6)       -               -               -

                      M4              M5              M6
No GP   A1 (A4)       12.34 (0.29)    22.21 (0.67)    33.40 (2.19)
        A2            -               29.41 (1.06)    47.12 (3.66)
        A3            -               29.48 (1.06)    48.06 (3.68)
        A5 (A6)       -               29.22 (1.04)    47.93 (3.62)

(a) Blocks world domain.

                      No subgoal      Local           Global
                      (Directness)    (Linear)        (Nonlinear)

                      M1              M2              M3
GP      A1 (A2, A3)   20.73 (0.49)    20.73 (0.49)    20.73 (0.49)
        A4 (A5, A6)   -               -               -

                      M4              M5              M6
No GP   A1 (A2, A3)   31.47 (0.85)    31.47 (0.85)    31.47 (0.85)
        A4 (A5, A6)   33.97 (0.92)    33.97 (0.92)    33.97 (0.92)

(b) Machine-shop scheduling domain.

Table 3.2: Average number of decisions (and CPU time (sec.)) per problem.


(a) Blocks world domain.

                      No subgoal       Local            Global
                      (Directness)     (Linear)         (Nonlinear)
                      M1               M2               M3
GP      A1 (A4)       1.82             1.85             2.00
        A2            -                2.35             2.54
        A3            -                -                2.54
        A5 (A6)       -                -                -
                      M4               M5               M6
No GP   A1 (A4)       1.82             3.00             2.90
        A2            -                3.78             3.88
        A3            -                3.83             4.07
        A5 (A6)       -                3.82             4.14

(b) Machine-shop scheduling domain.

                      No subgoal       Local            Global
                      (Directness)     (Linear)         (Nonlinear)
                      M1               M2               M3
GP      A1 (A2, A3)   2.43             2.43             2.43
        A4 (A5, A6)   -                -                -
                      M4               M5               M6
No GP   A1 (A2, A3)   4.13             4.13             4.13
        A4 (A5, A6)   4.47             4.47             4.47

Table 3.3: Average plan length per problem.


Table 3.1 shows the number of problems solved by each cell's planner, as defined
in Figure 3.11, for these two domains. The label $A_i$ denotes the problem set that
the method $M_i$ implicitly defines. With a sufficient time limit, every problem
solvable in principle by a planner was actually solved. Not surprisingly, the data
show a monotonic relationship between planner bias and scope, from a low of
68 problems in the blocks-world domain and 70 problems in the machine-shop
scheduling domain for the most restricted planner to a high of 100 problems in
both domains for the least restricted planner.

Tables 3.2 and 3.3 show the average number of decisions, average CPU time,
and average plan lengths (which should positively correlate with execution time)
for each of the distinct problem sets defined in Table 3.1. In the standard blocks-world
domain, four distinct problem sets are defined. This is because $A_4$ is the same as
$A_1$, since if a problem is not solvable with protection, it also is not solvable with
directness; and $A_5$ is the same as $A_6$, since both $M_5$ and $M_6$ are complete in this
domain, though $M_5$ may not be able to generate an optimal solution. These four
problem sets are associated with the four rows within each cell. In the machine-shop
scheduling domain, no precondition subgoals are required because there is
no operator which achieves any of the unmet preconditions. Thus both directness
and linearity are irrelevant. However, there are strong interactions among the
operators, so protection violations are still relevant. In consequence, two distinct
problem sets are defined: $A_1$ and $A_4$.

The timing results are shown in Table 3.2. Each entry gives the average number
of decisions followed, in parentheses, by the average CPU time required to
generate plans for the problems. The table shows that planning effort is also a
monotonically decreasing function of the amount of bias along these dimensions
(only for protection in the machine-shop scheduling domain). For example, for
problem set $A_1$ in the blocks-world domain, effort ranged from a low of 8.63
decisions for the most biased method (that is, the direct goal-protection method)
to a high of 33.40 decisions for the least biased method (that is, nonlinear
planning without goal-protection). This trade-off between efficiency and
completeness implies that selecting an appropriate amount of bias for a given
problem is critical for finding a solution quickly. Table 3.3 exhibits a similar
monotonic relationship between plan length and the amount of bias used.


3.5  Summary 

Six  single  methods  are  defined  along  two  bias  dimensions:  goal-flexibility,  and  goal- 
protection.  These  methods  are  implemented  in  Soar,  in  which  generated  plans  are 
represented  as  sets  of  control  rules  that  jointly  specify  which  operators  should 
be  executed  at  each  point  in  time.  The  six  implemented  methods  are  compared 
empirically  in  terms  of  planner  completeness,  planning  time,  and  plan  length.  The 
experimental  results  show  a  trade-off  between  completeness  and  efficiency.  This 
implies that the planning system would be best served if it could always opt for
the most restricted method adequate for its current situation.


Chapter 4


Multi-Method  Planners 


One  of  the  main  problems  with  the  planners  examined  in  the  previous  chapter  is 
that  each  is  either  incomplete  or  performs  a  significant  amount  of  excess  work  for 
some  of  the  problems  (both  in  planning  and  execution).  An  alternative  approach 
is  to  build  a  multi-method  planner  which  utilizes  a  coordinated  set  of  planning 
methods,  where  each  individual  method  has  different  scope  and  performance.  The 
basic  idea  underlying  this  thesis  is  to  select  and  coordinate  a  set  of  individual 
methods  based  on  the  empirical  performance  of  those  methods  for  a  training  set 
of  problems. 

Within the empirical multi-method planning framework, the main goal of this
research is to create a set of multi-method planners which are more efficient and
applicable than single-method planners. The previous chapter introduced a methodology
to create individual methods which have different performance and scope
based on the amount of bias used. Given a set of created methods, the key issue
is then how to coordinate the methods in an efficient manner so that the
multi-method planner can have high performance. Method coordination refers to (1)
the selection of appropriate methods as situations arise, and (2) the granularity of
method switching as the situational demands shift.

For method selection, individual methods need to be organized so that a higher-level
control structure can determine which method to use first, and which method
to use next when the current method fails. Two straightforward ways of organizing
individual methods are sequentially and in a time-shared manner. A sequential
multi-method planner consists of a sequence of single-method planners. A time-shared
multi-method planner consists of a set of single-method planners in which each
method is active in turn for a given time slice [Barley, 1991]. This thesis focuses
on a special type of sequential multi-method planning, called monotonic multi-method
planning. In a monotonic multi-method planner, the single methods
are sequenced according to increasing coverage and decreasing efficiency. With the
assumptions that earlier methods terminate, and that methods which are efficient
when they succeed do not waste too much time when they fail, monotonic
multi-method planners can generate plans efficiently by using more restricted methods
earlier in the sequence [Lee and Rosenbloom, 1992].
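
The control loop of a sequential multi-method planner can be sketched as follows. This is a minimal illustration only, not the Soar implementation used in this thesis: the function `run_sequential` and the two toy methods are hypothetical, and each method is assumed to terminate and to return `None` when it fails.

```python
from typing import Callable, Optional, Sequence

Plan = list[str]
Method = Callable[[str], Optional[Plan]]


def run_sequential(methods: Sequence[tuple[str, Method]], problem: str):
    """Try each method in order; return (name, plan) for the first one that succeeds."""
    for name, method in methods:
        plan = method(problem)  # assumed to terminate, returning None on failure
        if plan is not None:
            return name, plan
    return None, None  # no method in the sequence covers this problem


# Hypothetical toy methods: the restricted one is cheap but covers fewer problems.
def restricted(problem: str) -> Optional[Plan]:
    return ["op-a"] if problem == "easy" else None


def relaxed(problem: str) -> Optional[Plan]:
    return ["op-a", "op-b"]


# Methods ordered by increasing coverage, as in a monotonic multi-method planner.
planner = [("restricted", restricted), ("relaxed", relaxed)]
easy_result = run_sequential(planner, "easy")
hard_result = run_sequential(planner, "hard")
```

On the easy problem the restricted method answers and the relaxed one is never run; on the hard problem the restricted method's failure cost is paid before the relaxed method succeeds.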

One way to construct a monotonic multi-method planner is to use the biases
which themselves increase efficiency. Individual methods are sequenced so that
the set of biases used in a method is a subset of the biases used in earlier methods,
and the later methods have more coverage than the earlier methods. This
means that planning starts by trying highly efficient methods, and then successively
relaxes biases until a sufficient method is found. This type of planning
is called bias-relaxation multi-method planning. A bias-relaxation multi-method
planner is not necessarily a monotonic multi-method planner if there are interactions
among biases. However, one can generate monotonic multi-method planners
via bias-relaxation by just testing whether monotonicity holds for the created
bias-relaxation multi-method planners. In bias-relaxation multi-method planning, each
bias is evaluated independently by comparing a method which uses that bias only
and a method which uses no bias. Thus, bias-relaxation multi-method planning has
a more restricted scope in creating and comparing individual methods than
monotonic multi-method planning^.


^Strongly-monotonic multi-method planning described in [Lee and Rosenbloom, 1993] also
uses a bias-relaxation scheme. However, it evaluates each bias along with other biases, thus
examining more methods than bias-relaxation multi-method planning.

The second issue of method coordination is the granularity at which individual
methods are switched. This issue is important in terms of a planner's performance,
because the performance of a multi-method planner can change according
to the granularity of shifting control from method to method. Depending
on the granularity of method switching, multi-method planners can be further
specialized: coarse-grained multi-method planners, where methods are switched on a
problem-by-problem basis; and fine-grained multi-method planners, where methods
are switched on a goal-by-goal basis [Lee and Rosenbloom, 1993].

This chapter investigates these two issues of coordinating individual methods
in multi-method planning in depth. To investigate the method organization issue,
a scheme to construct monotonic multi-method planners from a set of single-method
planners is provided, and then a formal model is presented to compare the
performance of constructed monotonic multi-method planners with time-shared
multi-method planners and single-method planners. Also, a scheme to construct
a set of bias-relaxation multi-method planners is provided, and the constructed
bias-relaxation multi-method planners are compared experimentally with single-method
planners. To investigate the granularity of method switching, the performance
of coarse-grained bias-relaxation multi-method planners and fine-grained
bias-relaxation multi-method planners (called simply coarse-grained multi-method
planners and fine-grained multi-method planners, respectively, throughout this thesis)
is evaluated experimentally, and compared with the performance of single-method
planners.

Partial-order  planning  is  one  of  the  most  popular  approaches  in  the  planning 
literature.  At  the  end  of  this  chapter,  multi-method  planning  is  compared  with 
partial-order  planning  in  terms  of  planning  performance. 


4.1  Monotonic  Multi-Method  Planners 

In a monotonic multi-method planner, individual methods are sequenced so that
the earlier methods are more efficient and have less coverage than the later methods.
The idea is that if the biases used in efficient methods can prune the search space,
the problems solvable by efficient methods should be solved more quickly, while
problems requiring less bias should not waste too much extra time trying out the
insufficient early methods.

This approach is inspired by iterative deepening [Korf, 1985]. In iterative deepening,
a sequence of depth-first searches is performed, each to a greater depth
than the previous one. If a solution is found at a shallow depth, the cost of searching
to a greater depth is saved. If a solution is not found at a particular depth, a
deeper search is performed. The cost of doing the shallower searches is then wasted,
but since the deeper search costs at least $b$ times the cost of the shallower search,
where $b$ is the branching factor of the search tree, this cost can be relatively
quite small. Thus, if the proportion of problems solvable at shallow depths is large
enough, and the ratio of costs for successive levels is large enough, there should be
a net gain.
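
For readers unfamiliar with the technique, a minimal sketch of iterative deepening is given here. The tree, the `children` function, and the depth bound are hypothetical and serve only to show the repeated depth-limited searches described above.

```python
def depth_limited(node, goal, children, limit):
    """Depth-first search that gives up below `limit` levels; returns a path or None."""
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for child in children(node):
        path = depth_limited(child, goal, children, limit - 1)
        if path is not None:
            return [node] + path
    return None


def iterative_deepening(start, goal, children, max_depth):
    """Run depth-limited searches with limits 0, 1, ..., max_depth."""
    for limit in range(max_depth + 1):
        path = depth_limited(start, goal, children, limit)
        if path is not None:
            return path  # found at the shallowest possible depth
    return None


# Hypothetical toy tree: node n has children 2n+1 and 2n+2, for nodes below 15.
def children(n):
    return [c for c in (2 * n + 1, 2 * n + 2) if c < 15]


path_found = iterative_deepening(0, 12, children, 3)
path_missing = iterative_deepening(0, 99, children, 3)
```

The searches with limits 0, 1, and 2 are wasted work for the goal at depth 3, which is the analogue of the failure cost of the early, insufficient methods in a monotonic multi-method planner.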

A monotonic multi-method planner can be defined formally by using a restricted
dominance relation [Lee and Rosenbloom, 1992].

4.1.1  Restricted  Dominance  Relation 

Let $M_i$ be a single-method planner. Let $A$ be a sample set of problems, and let
$A_i \subseteq A$ be the subset of $A$ which is solvable in principle by $M_i$. The functions
$s(M_i, A_S)$ and $l(M_i, A_S)$ represent respectively the average cost that $M_i$ requires
to succeed and the average length of plans generated by $M_i$, for the problems in
$A_S \subseteq A_i$. Similarly, $f(M_i, A_F)$ represents the average wasted cost for $M_i$ to fail for
the problems in $A_F \subseteq A - A_i$.

Given a set of methods $\{M_i\}$ ($i = 1, \ldots, n$), a restricted dominance relation
$M_x \prec M_y$ is defined between two different single-method planners, $M_x$ and $M_y$, if the
following conditions hold:

(1) $A_x \subseteq A_y$,

(2) $s(M_x, A_i) \le s(M_y, A_i)$, for every $A_i \subseteq A_x$,

(3) $l(M_x, A_i) \le l(M_y, A_i)$, for every $A_i \subseteq A_x$.

A sequential multi-method planner which consists of $n$ different single-method
planners is denoted as $M_{k_1} \to M_{k_2} \to \cdots \to M_{k_n}$. A sequential multi-method planner
$M_{k_1} \to M_{k_2} \to \cdots \to M_{k_n}$ is called monotonic if $M_{k_i} \prec M_{k_{i+1}}$ holds for each
$i = 1, \ldots, n-1$.


Table 4.1: The performance of the six single-method planners for the problem sets
defined by the scopes of the planners. (a) Blocks-world domain. (b) Machine-shop
scheduling domain.


The straightforward way to build monotonic multi-method planners is to run
each of the individual methods on a set of training problems, and then to generate
from the resulting data the method sequences for which monotonicity holds.
Table 4.1 shows the average number of decisions, $s(M_{k_i}, A_{k_j})$, and the average plan
lengths, $l(M_{k_i}, A_{k_j})$, over a training problem set for the six single-method planners
defined in Chapter 3, for the blocks-world domain and the machine-shop scheduling
domain.
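
A check of the restricted dominance relation against training data might look like the sketch below. It assumes that the problem sets quantified over in conditions (2) and (3) are the scopes of the methods under consideration, and it uses invented per-problem numbers; the dictionary layout and the function name `dominates` are hypothetical.

```python
from statistics import mean


def dominates(x, y, scope, cost, length):
    """Return True if method x restrictedly dominates method y.

    scope[m]     : set of training problems solvable by method m
    cost[m][p]   : planning cost of m on problem p (for p in scope[m])
    length[m][p] : plan length of m on problem p (for p in scope[m])
    """
    if not scope[x] <= scope[y]:  # condition (1): the scope of x lies within that of y
        return False
    for a_i in scope.values():  # assumed: candidate sets are the methods' scopes
        if a_i and a_i <= scope[x]:
            # condition (2): x is on average no more costly than y on a_i
            if mean(cost[x][p] for p in a_i) > mean(cost[y][p] for p in a_i):
                return False
            # condition (3): x produces plans that are on average no longer
            if mean(length[x][p] for p in a_i) > mean(length[y][p] for p in a_i):
                return False
    return True


# Invented toy data for two methods and two problems.
scope = {"M1": {"p1"}, "M6": {"p1", "p2"}}
cost = {"M1": {"p1": 8}, "M6": {"p1": 30, "p2": 45}}
length = {"M1": {"p1": 2}, "M6": {"p1": 3, "p2": 4}}

m1_before_m6 = dominates("M1", "M6", scope, cost, length)
m6_before_m1 = dominates("M6", "M1", scope, cost, length)
```

Applying such a check to every ordered pair of methods yields the arcs of a restricted dominance graph.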

For each domain, the problem set consists of 30 problems which are randomly
generated as in Chapter 3. In the blocks-world domain, $A_2$ and $A_3$ are different sets
in principle, because problems such as Sussman's anomaly cannot be solved by a
linear planner with protection ($M_2$) but can be by a nonlinear planner with protection
($M_3$). However, among the 30 training problems, these "anomaly" problems
did not occur, yielding $A_2 = A_3$ for this set of problems.

Figure 4.1 exhibits restricted dominance graphs based on the results in Table 4.1.
Each node in a graph represents a single-method planner, and an arc
from $M_x$ to $M_y$ implies that $M_x \prec M_y$ holds. Thus every path in the graph
corresponds to a monotonic multi-method planner. A monotonic multi-method
planner $M_{k_1} \to \cdots \to M_{k_n}$ is complete if $M_{k_n}$ is complete. In the blocks-world
domain, seven complete 2-method planners and four complete 3-method planners
can be constructed, whereas in the machine-shop scheduling domain, nine complete
2-method planners can be constructed.
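
Reading planners off a dominance graph amounts to enumerating its paths. The sketch below assumes an acyclic graph stored as an adjacency dictionary. The arcs shown are hypothetical: they were chosen only so that the count agrees with the nine complete 2-method planners reported for the machine-shop scheduling domain, and they are not a transcription of Figure 4.1.

```python
def monotonic_planners(edges, complete):
    """Yield every path of at least two methods that ends in a complete method.

    edges[m] lists the methods that m restrictedly dominates (graph assumed acyclic).
    """
    def extend(path):
        if len(path) >= 2 and path[-1] in complete:
            yield tuple(path)
        for nxt in edges.get(path[-1], ()):
            yield from extend(path + [nxt])

    for start in edges:
        yield from extend([start])


# Hypothetical arcs: each protection-based method dominates each complete method.
edges = {m: ["M4", "M5", "M6"] for m in ("M1", "M2", "M3")}
complete = {"M4", "M5", "M6"}

planners = list(monotonic_planners(edges, complete))
```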

The next section compares a monotonic multi-method planner with its corresponding
time-shared multi-method planner and single-method planner in terms
of planning time and plan length.


4.1.2  Performance  Analysis 

Planning time: In this section, it is assumed that the individual methods in
a sequential multi-method planner are switched on a problem-by-problem basis.
For a given problem $a \in A$, the planning time of a sequential multi-method planner
$M_{k_1} \to M_{k_2} \to \cdots \to M_{k_n}$, where $M_{k_i}$ is the first method which solves $a$, can be
represented as

$$ s(M_{k_1} \to M_{k_2} \to \cdots \to M_{k_n}, \{a\}) = s(M_{k_i}, \{a\}) + \sum_{j=1}^{i-1} f(M_{k_j}, \{a\}), \tag{4.1} $$

where $s(M_{k_i}, \{a\})$ is the cost for $M_{k_i}$ to solve $a$, and $\sum_{j=1}^{i-1} f(M_{k_j}, \{a\})$ is the sum
of the costs for inappropriate earlier methods to fail for $a$.

The corresponding time-shared multi-method planner consists of the same set
of single methods, denoted as $M_{k_1} \| M_{k_2} \| \cdots \| M_{k_n}$. Let $M_{k_i}$ be the first method that
solves $a$ in a horse-race manner. Suppose that the switching in a time-shared
multi-method planner is based on a unit time slice. Then, the expected planning time of
$M_{k_1} \| M_{k_2} \| \cdots \| M_{k_n}$ for a problem $a \in A$ can be represented as

$$ s(M_{k_1} \| M_{k_2} \| \cdots \| M_{k_n}, \{a\}) = s(M_{k_i}, \{a\}) + \sum_{j \ne i} \min[f(M_{k_j}, \{a\}), s(M_{k_i}, \{a\})], \tag{4.2} $$

where the first term is the cost for the method that actually solves $a$, and the
second term is the sum of the costs for the rest of the methods either to fail for $a$
($f(M_{k_j}, \{a\})$) or to try to solve $a$ ($s(M_{k_i}, \{a\})$).
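
The two per-problem costs can be compared numerically with a small sketch. The cost figures below are invented, the function names are hypothetical, and a method with no recorded failure cost is assumed to still be running when the winner finishes, so it contributes the winner's success cost to the time-shared total.

```python
import math


def sequential_time(order, s, f):
    """Eq. (4.1): failure costs of earlier methods plus the first success cost.

    s[m] is the success cost of methods that can solve the problem,
    f[m] is the failure cost of methods that cannot.
    """
    total = 0.0
    for m in order:
        if m in s:
            return total + s[m]
        total += f[m]
    return math.inf  # no method in the sequence solves the problem


def time_shared_time(order, s, f):
    """Eq. (4.2): the winner's cost plus what every other method consumed meanwhile."""
    if not s:
        return math.inf
    winner = min(s, key=s.get)  # first method to finish in the horse race
    others = (min(f.get(m, math.inf), s[winner]) for m in order if m != winner)
    return s[winner] + sum(others)


# Invented example: M1 fails after 5 units, M4 succeeds after 12, M6 after 33.
order = ["M1", "M4", "M6"]
s = {"M4": 12, "M6": 33}
f = {"M1": 5}

seq = sequential_time(order, s, f)   # 5 + 12
shared = time_shared_time(order, s, f)  # 12 + min(5, 12) + 12
```

In this example the sequential planner is cheaper because the failing method fails quickly; a slow-failing first method would shift the balance toward time sharing.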

The average planning time for a problem set $A$ can be represented by using
a probability function. Let $P_{k_i}$ ($= |A_{k_i}|/|A|$) be the probability that an arbitrary
problem in $A$ is solvable by $M_{k_i}$. Let $M_{k_0}$ be a null planner which cannot solve any
problem; that is, $A_{k_0} = \emptyset$ and $P_{k_0} = 0$. Let $A'_{k_i} = A_{k_i} - A_{k_{i-1}}$, for $1 \le i \le n$, be the set
of problems which are solvable by $M_{k_i}$ but not by $M_{k_{i-1}}$, and let $P'_{k_i} = |A'_{k_i}|/|A|$,
for $1 \le i \le n$. Let $s(M_{k_0}, A_S) = l(M_{k_0}, A_S) = 0$, for any $A_S$, and $f(M_{k_0}, A_F) = 0$, for
any $A_F$.

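For a problem set $A$, the planning time of a complete sequential multi-method
planner $M_{k_1} \to M_{k_2} \to \cdots \to M_{k_n}$ can be rewritten as the sum of the average planning
time for the disjoint problem sets $A'_{k_i}$ ($1 \le i \le n$):

$$ s(M_{k_1} \to \cdots \to M_{k_n}, A) = \sum_{i=1}^{n} \left[ P'_{k_i} \cdot s(M_{k_1} \to \cdots \to M_{k_n}, A'_{k_i}) \right], $$

where

$$ s(M_{k_1} \to \cdots \to M_{k_n}, A'_{k_i}) = s(M_{k_i}, A'_{k_i}) + \sum_{j=1}^{i-1} f(M_{k_j}, A'_{k_i}). \tag{4.3} $$

The planning time of the corresponding time-shared multi-method planner can
be rewritten as

$$ s(M_{k_1} \| \cdots \| M_{k_n}, A) = \sum_{i=1}^{n} \left[ P'_{k_i} \cdot s(M_{k_1} \| \cdots \| M_{k_n}, A'_{k_i}) \right], $$

where

$$ s(M_{k_1} \| \cdots \| M_{k_n}, A'_{k_i}) = s(M_{k_i}, A'_{k_i}) + \sum_{j \ne i} \frac{\sum_{a \in A'_{k_i}} \min[f(M_{k_j}, \{a\}), s(M_{k_i}, \{a\})]}{|A'_{k_i}|}. \tag{4.4} $$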

The relative performance between a complete sequential multi-method planner
and the corresponding time-shared multi-method planner depends on the ordering
of the methods and the cost of $f(M_{k_j}, \{a\})$ and $s(M_{k_i}, \{a\})$. If $M_{k_1} \to M_{k_2} \to \cdots \to M_{k_n}$
is monotonic, later methods ($M_{k_j}$, $j = i+1, \ldots, n$) would not fail to solve $a$, from the
definition. Thus,

$$ s(M_{k_1} \| \cdots \| M_{k_n}, A'_{k_i}) = \left( (n - i + 1) \cdot s(M_{k_i}, A'_{k_i}) \right) + \sum_{j=1}^{i-1} \frac{\sum_{a \in A'_{k_i}} \min[f(M_{k_j}, \{a\}), s(M_{k_i}, \{a\})]}{|A'_{k_i}|}. \tag{4.5} $$

In particular, if $f(M_{k_j}, \{a\}) < s(M_{k_i}, \{a\})$ ($1 \le j \le i-1$) for each $a$, we have

$$ s(M_{k_1} \| \cdots \| M_{k_n}, A'_{k_i}) = \left( (n - i + 1) \cdot s(M_{k_i}, A'_{k_i}) \right) + \sum_{j=1}^{i-1} f(M_{k_j}, A'_{k_i}). \tag{4.6} $$

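Thus, the performance difference between a monotonic multi-method planner and
the corresponding time-shared multi-method planner is

$$ s(M_{k_1} \| \cdots \| M_{k_n}, A) - s(M_{k_1} \to \cdots \to M_{k_n}, A) = \sum_{i=1}^{n} \left[ P'_{k_i} \cdot (n - i) \cdot s(M_{k_i}, A'_{k_i}) \right]. \tag{4.7} $$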

This implies that if the cost of failure for a restricted planner is always less
than the cost of success for a more relaxed planner, then monotonic multi-method
planners outperform corresponding time-shared multi-method planners; otherwise,
time-shared multi-method planners may perform better. This all depends on the
relative search space size for the restricted planner and the density and distribution
of solutions in the search space. However, if the biases used in a restricted method
are strong enough to cut off all the failure paths at shallow depths, the cost to
determine whether a method fails may be less than the cost to determine whether
a method succeeds. Moreover, if the rule learned from a failure path can transfer
to other failure paths, the cost of failure can be even less.
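
The step from the per-subset costs to the difference in (4.7) can be spelled out as follows. This is a sketch of the algebra under the assumption stated above, namely that every earlier method fails faster than the solving method succeeds, so that the time-shared cost on $A'_{k_i}$ takes the form given in (4.6).

```latex
\begin{align*}
& s(M_{k_1} \| \cdots \| M_{k_n}, A'_{k_i}) - s(M_{k_1} \to \cdots \to M_{k_n}, A'_{k_i}) \\
&\quad = \Big[ (n-i+1)\, s(M_{k_i}, A'_{k_i}) + \sum_{j=1}^{i-1} f(M_{k_j}, A'_{k_i}) \Big]
       - \Big[ s(M_{k_i}, A'_{k_i}) + \sum_{j=1}^{i-1} f(M_{k_j}, A'_{k_i}) \Big] \\
&\quad = (n-i)\, s(M_{k_i}, A'_{k_i}).
\end{align*}
```

Weighting each such difference by $P'_{k_i}$ and summing over $i = 1, \ldots, n$ gives the right-hand side of (4.7). Every term is non-negative, which is why the monotonic planner is never slower under this assumption.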

The performance of monotonic multi-method planners and single-method planners
is compared as follows. For each monotonic multi-method planner
$M_{k_1} \to \cdots \to M_{k_n}$ there is a corresponding single-method planner $M_{k_n}$ which has the same
coverage of solvable problems. If $M_{k_1} \to \cdots \to M_{k_n}$ is complete, $M_{k_n}$ is also
complete. We compare a complete monotonic multi-method planner with its corresponding
single-method planner in terms of planning time.

The performance of $M_{k_n}$ is

$$ s(M_{k_n}, A) = \sum_{i=1}^{n} \left[ P'_{k_i} \cdot s(M_{k_n}, A'_{k_i}) \right]. \tag{4.8} $$

To compare the performance of $M_{k_1} \to M_{k_2} \to \cdots \to M_{k_n}$ with $M_{k_n}$, it is necessary
to subtract (4.3) from (4.8), yielding

$$ s(M_{k_n}, A) - s(M_{k_1} \to M_{k_2} \to \cdots \to M_{k_n}, A) = \sum_{i=1}^{n} \Big[ P'_{k_i} \cdot \Big( s(M_{k_n}, A'_{k_i}) - s(M_{k_i}, A'_{k_i}) - \sum_{j=1}^{i-1} f(M_{k_j}, A'_{k_i}) \Big) \Big]. \tag{4.9} $$

This means that if the performance gain by using a cheaper method
($s(M_{k_n}, A'_{k_i}) - s(M_{k_i}, A'_{k_i})$) is greater than the wasted time from using inappropriate methods
($\sum_{j=1}^{i-1} f(M_{k_j}, A'_{k_i})$) in a monotonic multi-method planner, then it is preferable to
use that method over the single-method planner; otherwise, the single-method
planner is preferred (at least where planning time is concerned).
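
A small numeric sketch shows how this comparison plays out. It evaluates the average planning time of a sequential planner, in the form of (4.3), and of its single-method counterpart, in the form of (4.8), on invented figures for a two-method planner; all numbers and function names are hypothetical.

```python
import math


def sequential_avg(p, s_own, f):
    """Average time of the sequential planner over the disjoint problem subsets.

    p[i]      : fraction of problems first solved by the i-th method
    s_own[i]  : average success cost of the i-th method on those problems
    f[(j, i)] : average failure cost of the j-th method on those problems, for j < i
    """
    return sum(
        p[i] * (s_own[i] + sum(f[(j, i)] for j in range(i)))
        for i in range(len(p))
    )


def single_avg(p, s_last):
    """Average time of the last (most relaxed) method used alone on every subset."""
    return sum(p_i * s_i for p_i, s_i in zip(p, s_last))


# Invented figures: 70% of problems are solved by the restricted method.
p = [0.7, 0.3]
s_own = [10.0, 40.0]      # each method on the problems it is first to solve
s_last = [30.0, 40.0]     # the relaxed method on both subsets
f = {(0, 1): 8.0}         # the restricted method failing on the harder subset

multi = sequential_avg(p, s_own, f)   # 0.7 * 10 + 0.3 * (40 + 8)
single = single_avg(p, s_last)        # 0.7 * 30 + 0.3 * 40
gain = single - multi
```

Here the saving on the easy subset outweighs the failure cost paid on the hard subset, so the multi-method planner is preferred; raising the failure cost or lowering the share of easy problems reverses the result.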

Plan length: The plan length $l$ for a complete monotonic multi-method planner
and for the corresponding time-shared multi-method planner is the same and
equal to

$$ l(M_{k_1} \to M_{k_2} \to \cdots \to M_{k_n}, A) = l(M_{k_1} \| M_{k_2} \| \cdots \| M_{k_n}, A) = \sum_{i=1}^{n} \left[ P'_{k_i} \cdot l(M_{k_i}, A'_{k_i}) \right], \tag{4.10} $$

while the plan length for the corresponding single-method planner $M_{k_n}$ is

$$ l(M_{k_n}, A) = \sum_{i=1}^{n} \left[ P'_{k_i} \cdot l(M_{k_n}, A'_{k_i}) \right]. \tag{4.11} $$

Since $M_{k_1} \to M_{k_2} \to \cdots \to M_{k_n}$ is monotonic, $l(M_{k_i}, A'_{k_i}) \le l(M_{k_n}, A'_{k_i})$. Therefore,
the lengths of plans generated from a monotonic multi-method planner and
the corresponding time-shared multi-method planner are always less than or equal
to the length of plans generated from the corresponding single-method planner.


4.1.3 Learning in Multi-Method Planning


The  analytical  results  in  the  previous  section  show  that  monotonicity  can  yield 
performance  gain  in  sequential  multi-method  planning  by  using  cheaper  methods 
earlier  in  the  sequence.  The  performance  of  sequential  multi-method  planning  can 
be  further  improved  by  ameliorating  the  effects  of  wasting  effort  on  insufficient 
planners  via  learning  —  in  particular,  of  two  sorts.  The  first  sort  of  learning  is 
within-planner  learning  that  can  transfer  across  planners  (possibly  for  the  same 
problem).  If  a  projection  is  performed  within  one  planner,  and  the  results  of  the 
projection  depend  only  on  aspects  of  the  planner  that  are  shared  by  a  second 
planner,  then  it  should  not  be  necessary  to  repeat  that  projection  when  the  second 
planner  is  tried.  For  example,  a  rule  learned  from  a  plan  violating  goal  protection  in 
the  direct  goal-protection  planner  should  be  able  to  transfer  to  the  nonlinear  goal- 
protection  planner,  where  it  prevents  the  planner  from  reprojecting  along  paths 
that  violate  goal  protection. 

The second sort of learning is about which methods to use for which classes of
problems. To the extent that this can be done, the effort wasted in trying inadequate
methods can be avoided in the future. In our Soar-based implementation,
bias selection is structured just as would be any other selection, so this sort of
learning can happen automatically by chunking. From an experiment with such
learning, Figure 4.2 shows a rule learned to avoid using the most restricted method
(that is, direct goal-protection) under specific circumstances where there is
only one active goal conjunct but (at least) two blocks must be moved to achieve
it. This rule was learned during the first problem and can be used in three later
problems to avoid even trying this method.

Though we have examined instances of learning about which methods to use for
which classes of problems in the context of multi-method planning, no systematic
study has yet been made of their effectiveness or of whether issues of overgeneralization
and/or undergeneralization will prove troublesome, which they are likely
to be. Another reason why this form of learning is not used is that the current
multi-attribute encoding creates some expensive chunks for some of the problems.
Future work should include rerunning the experiments summarized in Table 4.1
with this form of learning enabled.

Figure 4.2: Example of learning which planners to use for which classes of problems:
(a) a learned rule to avoid the direct goal-protection planner, (b) a class of problems
in which this rule is applicable.

4.2  Bias-Relaxation  Multi-Method  Planners 


Section 4.1.1 showed an approach to creating monotonic multi-method planners
by using a restricted dominance graph. This approach is quite straightforward in
the sense that each pair of methods is directly compared. However, given a set of
methods, it does not specify which methods should be created and which pairs of
methods should be compared. Let $k$ be the number of biases. Then, the number
of single methods generated by every combination of these biases is $O(2^k)$. The
number of comparisons for creating a restricted dominance graph for these methods
is $O(2^{2k})$. Although it is tractable to generate all monotonic multi-method planners
by using a restricted dominance graph with a small set of initial biases, as in the
case of Section 4.1.1, it may not be tractable if the number of biases considered
is increased. Therefore, a scheme is needed to restrict the scope of methods to be
generated and compared.
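
The growth described here is easy to see by enumeration. The sketch below assumes $k$ independent, freely combinable biases with placeholder names; it counts one candidate method per subset of biases and one dominance test per ordered pair of distinct methods.

```python
from itertools import combinations


def all_methods(biases):
    """Every subset of the biases defines one candidate single method."""
    return [
        frozenset(combo)
        for r in range(len(biases) + 1)
        for combo in combinations(biases, r)
    ]


def ordered_pairs(methods):
    """Number of ordered pairs (x, y), x != y, to test for restricted dominance."""
    return len(methods) * (len(methods) - 1)


three = all_methods(["b1", "b2", "b3"])                # 2**3 = 8 methods
ten = all_methods([f"b{i}" for i in range(1, 11)])     # 2**10 = 1024 methods

pairs_three = ordered_pairs(three)
pairs_ten = ordered_pairs(ten)
```

With three biases the pairwise comparison is trivial, but with ten it already exceeds a million dominance tests, each of which requires performance data for both methods.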

One way to remedy this problem is to compare the effectiveness of each bias in
isolation, instead of comparing the performance of methods which are generated
with respect to combinations of these biases. The approach presented in this section
is based on bias-relaxation [Lee and Rosenbloom, 1993]. Bias-relaxation multi-method
planners can be created as combinations of effective biases only, so that
later methods can embody subsets of the effective biases incorporated into earlier
methods^. Method switching is implemented by relaxing some of these biases; that
is, planning starts with a set of effective biases, and then successively relaxes one
or more biases until a solution is found within the method. This can be formalized
as follows.

Let $B_{k_i}$ be the set of biases used in $M_{k_i}$. A bias $b$ is called effective in a problem
set $A$ and a method set $\{M_{k_i}\}$, if for a pair of methods $M_{k_x}$ and $M_{k_y}$ in $\{M_{k_i}\}$ such
that $B_{k_x} = \{b\}$ and $B_{k_y} = \emptyset$,

(1) $s(M_{k_x}, A_{k_x}) \le s(M_{k_y}, A_{k_x})$, and

(2) $l(M_{k_x}, A_{k_x}) \le l(M_{k_y}, A_{k_x})$.

A sequential multi-method planner $M_{k_1} \to \cdots \to M_{k_n}$ is called a bias-relaxation
multi-method planner, if

(1) $B_{k_{i-1}} \supset B_{k_i}$, for $2 \le i \le n$, and

(2) $B_{k_{i-1}} - B_{k_i}$ consists of effective biases only, for $2 \le i \le n$.

Given a set of $k$ biases, the time complexity of testing whether these biases
are effective or not is $O(k)$ (by factoring out the complexity of solving problems),
which is exponentially smaller than $O(2^{2k})$.
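
These two definitions translate into a pair of short checks. The sketch below is illustrative only: the measurements are invented, chosen so that directness and protection come out effective and linearity does not, and the function names and tuple layout are hypothetical.

```python
def effective_biases(measurements):
    """One test per bias, so the work is linear in the number of biases.

    measurements[b] = (s_b, s_none, l_b, l_none): average cost and plan length of
    the method using only bias b and of the unbiased method, both measured on the
    problems that the biased method can solve.
    """
    return {
        bias
        for bias, (s_bias, s_none, l_bias, l_none) in measurements.items()
        if s_bias <= s_none and l_bias <= l_none
    }


def is_bias_relaxation(sequence, effective):
    """sequence lists the bias sets of the methods, earliest method first.

    Each step must drop at least one bias, and only effective biases may be dropped.
    """
    return all(
        later < earlier and (earlier - later) <= effective
        for earlier, later in zip(sequence, sequence[1:])
    )


# Invented measurements (not taken from Table 4.1).
measurements = {
    "directness": (12.0, 33.0, 1.8, 2.9),
    "protection": (16.0, 33.0, 2.0, 2.9),
    "linearity": (35.0, 33.0, 3.1, 2.9),
}
effective = effective_biases(measurements)

d, p, lin = "directness", "protection", "linearity"
ok = is_bias_relaxation([frozenset({d, p}), frozenset({p}), frozenset()], effective)
bad = is_bias_relaxation([frozenset({lin}), frozenset()], effective)
```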

^Positive bias, as defined in [Lee and Rosenbloom, 1993], is a different notion from effective
bias here in that the effectiveness of a positive bias is evaluated along with other biases.

The results in Table 4.1 imply that directness and protection are effective in
the  blocks-world  domain,  while  linearity  is  not,  since  and 

/(Afs,  A])  >  /(Ms,  As).  If  one  uses  linearity  as  an  independent  bias  —  so  that  one 
set  of  multi-method  planners  is  generated  using  it  and  one  set  without  it  —  and 
varies directness and protection within the individual multi-method planners, one
gets a set of ten different bias-relaxation multi-method planners (four three-method
planners and six two-method planners) as shown in Table 4.2. In the machine-shop
scheduling domain, only protection is effective. In consequence, only one
bias-relaxation multi-method planner is generated.


Type         Multi-method planners

Linear       M1 → M2 → M5
             M1 → M4 → M5
             M1 → M5
             M2 → M5
             M4 → M5

Nonlinear    M1 → M3 → M6
             M1 → M4 → M6
             M1 → M6
             M3 → M6
             M4 → M6


Table 4.2: Ten bias-relaxation multi-method planners in the blocks-world.
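
The count of five planners per type can be reproduced by enumerating relaxation chains. The sketch below assumes that every planner ends in the method with none of the varied biases, and that directness and protection are the two biases being relaxed; the function name is hypothetical.

```python
from itertools import combinations


def relaxation_chains(effective):
    """Yield every strictly decreasing chain of bias sets that ends in the empty set."""
    subsets = [
        frozenset(combo)
        for r in range(1, len(effective) + 1)
        for combo in combinations(sorted(effective), r)
    ]

    def extend(chain):
        last = chain[-1]
        if not last:  # reached the method with no remaining bias
            yield chain
            return
        # Relax one or more biases: move to the empty set or to a smaller non-empty set.
        for nxt in [frozenset()] + [s for s in subsets if s < last]:
            yield from extend(chain + [nxt])

    for start in subsets:
        yield from extend([start])


chains = list(relaxation_chains({"directness", "protection"}))
three_method = [c for c in chains if len(c) == 3]
two_method = [c for c in chains if len(c) == 2]
```

Two three-method and three two-method chains result; generating one such family with linearity held on and one with it held off gives the ten planners of Table 4.2.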


Note that if there are no interactions among effective biases, a bias-relaxation
multi-method planner is a special case of a monotonic multi-method planner. However,
this is not necessarily true if there are interactions among them. For example,
although directness is effective, this does not necessarily mean that the method
that uses directness and protection is more efficient than the method that uses
protection only.

In order to generate monotonic multi-method planners via bias-relaxation, one
can just test whether monotonicity holds for the created bias-relaxation multi-method
planners. The time complexity for this procedure is linear in terms of
the number of biases, because at least one bias is relaxed whenever a method is
switched.

In the next section, the experimental results for all of the created bias-relaxation
multi-method planners are presented.


Table 4.3: Single-method and bias-relaxation multi-method planning.
(a) Blocks-world domain. (b) Machine-shop scheduling domain.


4.2.1  Experimental  Results 

We have implemented the ten bias-relaxation multi-method planners in Soar6.
Each single-method planner in a bias-relaxation multi-method planner was implemented
as a specialization of a general problem-space. Based on the sequence of
single-method planners, a set of meta-level control rules was provided to coordinate
which problem-space is tried next if the current problem-space does not generate
a plan for the given problem. Only within-trial learning was turned on for each
problem, as in the experiments with the single-method planners, but learned rules
were also allowed to transfer from an earlier method to a later method (for the same
problem). This is equivalent to the type of transfer allowed in the single-method
planners, because the scope of transfer is limited to the current trial only.

Table 4.3 compares the ten bias-relaxation multi-method planners with the
two complete single-method planners over the test set of 100 randomly generated
problems used in Chapter 3 (this test set is different from the 30-problem
training set used in developing the multi-method planners). Paired-sample
Z-tests are made for the average performance on As (because it is the only
complete problem set in this domain) between bias-relaxation multi-method
planners and single-method planners. The results reveal that bias-relaxation
multi-method planners take significantly less planning time (z=2.27, p<.05),
and generate significantly shorter plans than single-method planners (z=4.86,
p<.01). In the machine-shop scheduling domain, paired-sample Z-tests are made
for the average performance on A4. The results show that bias-relaxation
multi-method planners take slightly more planning time than single-method
planners; however, no significance is found at a 5% level (z=1.00). In terms of
plan length, bias-relaxation multi-method planners generate significantly
shorter plans than single-method planners (z=3.15, p<.01) in this domain also.
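
For readers who want to reproduce this kind of comparison, a paired-sample
statistic can be computed as sketched below. This is the textbook form (mean of
the per-problem differences divided by its standard error); whether the
experiments used exactly this form is an assumption, and the function name
paired_z is hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def paired_z(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Paired-sample statistic for two planners measured on the same problems.

    xs[i] and ys[i] are the measurements (e.g. decisions) of the two planners
    on problem i. Positive values mean the first planner scored higher.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 items")
    diffs = [x - y for x, y in zip(xs, ys)]
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("all differences are identical; statistic undefined")
    return mean(diffs) / (spread / math.sqrt(len(diffs)))
```

With a large sample such as the 100 test problems, the usual two-sided normal
thresholds are about 1.96 for the 5% level and 2.58 for the 1% level.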

Although it has been shown that bias-relaxation multi-method planners can
outperform single-method planners (in the blocks-world domain), it does not
necessarily mean that, for all situations, there exists a bias-relaxation
multi-method planner which outperforms the most efficient single-method
planner. In fact, the performance of these planners depends on the biases used
in the bias-relaxation multi-method planners and the problem set used in the
experiments. For example, if the problems are so complex that most of them are
solvable only by the least restricted method, the performance loss from trying
inappropriate earlier methods in multi-method planners might be considerable.
On the other hand, if the problems are so trivial that it takes only a few
decisions for the least restricted method to solve them, the slight performance
gain from using more restricted methods in multi-method planners might be
outweighed by the complexity of the meta-level processing required to
coordinate the sequence of primitive planners.
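
The trade-off described above can be made concrete with a small sketch of
problem-by-problem method switching. Everything in it is illustrative: the two
toy methods, their decision counts, and the switch_cost parameter standing in
for meta-level processing are assumptions, not measurements from the
experiments.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A method maps a problem to (plan or None, decisions spent on the attempt).
MethodFn = Callable[[str], Tuple[Optional[List[str]], int]]


def coarse_grained_plan(
    problem: str, methods: Sequence[MethodFn], switch_cost: int = 0
) -> Tuple[Optional[List[str]], int]:
    """Try each method on the whole problem, from most to least restricted.

    Decisions spent by failed methods are wasted and still counted, as is the
    (assumed) meta-level cost of each switch.
    """
    total = 0
    for index, method in enumerate(methods):
        plan, decisions = method(problem)
        total += decisions
        if plan is not None:
            return plan, total
        if index < len(methods) - 1:
            total += switch_cost
    return None, total


def restricted(problem: str) -> Tuple[Optional[List[str]], int]:
    # Hypothetical biased method: cheap, but only solves "easy" problems.
    return (["a"], 5) if problem == "easy" else (None, 8)


def relaxed(problem: str) -> Tuple[Optional[List[str]], int]:
    # Hypothetical least restricted method: complete but expensive.
    return ["a", "b"], 30
```

On an "easy" problem the sequence costs 5 decisions instead of 30; on any other
problem it costs 8 + 1 + 30 = 39 with switch_cost=1, which is worse than
running the relaxed method alone.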

4.3  Fine-Grained  Multi-Method  Planners 

The approach to multi-method planning described so far starts with a restricted
method and switches to a less restricted method whenever the current method
fails. This switch is always made on a problem-by-problem basis. However, this
is not the only granularity at which methods could be switched. The family of
multi-method planning systems can be viewed on a granularity spectrum. While in
coarse-grained multi-method planners, methods are switched for a whole problem
when no solution can be found for the problem within the current method, in
fine-grained multi-method planners methods can be switched at any point during
a problem at which a new set of subgoals is formulated, and the switch only
occurs for that set of subgoals (and not for the entire problem)
[Lee and Rosenbloom, 1993]. At this finer level of granularity it is conceivable
that the planner could use a highly-restricted and efficient method over much of
a problem, but fall back on a nonlinear method without protection for those
critical subregions where there are tricky interactions.
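
A minimal sketch of switching at the subgoal level is given below. It assumes a
problem has already been broken into a list of independent subgoals, which
ignores the goal interactions a real planner must handle; the toy methods and
their decision counts are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A method maps one subgoal to (plan steps or None, decisions spent).
SubgoalMethod = Callable[[str], Tuple[Optional[List[str]], int]]


def fine_grained_plan(
    subgoals: Sequence[str], methods: Sequence[SubgoalMethod]
) -> Tuple[Optional[List[str]], int]:
    """For each subgoal, start again from the most restricted method.

    A fallback to a less restricted method applies to that subgoal only; work
    already done for earlier subgoals is kept.
    """
    plan: List[str] = []
    total = 0
    for subgoal in subgoals:
        for method in methods:
            steps, decisions = method(subgoal)
            total += decisions
            if steps is not None:
                plan.extend(steps)
                break
        else:
            return None, total  # no method could handle this subgoal
    return plan, total


def restricted(subgoal: str) -> Tuple[Optional[List[str]], int]:
    # Hypothetical biased method: fails on subgoals marked as tricky.
    if subgoal.startswith("tricky"):
        return None, 2
    return [f"do-{subgoal}"], 2


def relaxed(subgoal: str) -> Tuple[Optional[List[str]], int]:
    # Hypothetical least restricted method: always succeeds, at higher cost.
    return [f"do-{subgoal}"], 10
```

For the subgoals g1, tricky, g3 this spends 2 + (2 + 10) + 2 = 16 decisions,
whereas solving all three with the relaxed method would spend 30.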

With this flexibility of method switching, fine-grained multi-method planning
can potentially outperform both coarse-grained multi-method planning and
single-method planning. Compared with coarse-grained multi-method planning, it
can save the effort of backtracking when the current method cannot find a
solution or the current partial plan violates the biases used in the current
method. Moreover, it can save the extra effort of using a less restricted method
on later parts of the problem, just because one early part requires it. As
compared with single-method planning, a fine-grained multi-method planner can
utilize biases which would cause incompleteness in a single-method planner
(such as directness or protection in the blocks-world domain) while still
remaining complete. The result is that a fine-grained multi-method planner can
potentially be more efficient than a single-method planner that has the same
coverage of solvable problems.

Planning Type     Average Decisions   z vs. Single   z vs. Coarse-Grained
Single            38.58               -              -
Coarse-Grained    29.98               2.27*          -
Fine-Grained      16.44               5.37**         6.72**

(a) Decisions.

Planning Type     Average Plan Length   z vs. Single   z vs. Coarse-Grained
Single            3.98                  -              -
Coarse-Grained    2.82                  4.86**         -
Fine-Grained      3.04                  3.42**         1.77

(b) Plan length.

Table 4.5: Significance test results for the blocks-world domain.


4.3.1  Experimental  Results 

Table 4.4 compares the bias-relaxation fine-grained multi-method planners with
the corresponding bias-relaxation coarse-grained multi-method planners and
(complete) single-method planners over the same test set of 100 problems as used
in Table 4.3 in the blocks-world domain. Paired-sample Z-tests on this data, as
shown in Table 4.5, reveal that fine-grained multi-method planners take
significantly less planning time than both single-method planners (z=5.35,
p<.01) and coarse-grained multi-method planners (z=6.72, p<.01). This likely
stems from fine-grained multi-method planners preferring to search within the
more efficient spaces defined by the biases (thus tending to outperform
single-method planners) but being able to recover from bias failure without
throwing away everything already done for a problem (thus tending to outperform
coarse-grained multi-method planners).

Table 4.6: Single-method and coarse-grained multi-method vs. fine-grained
multi-method planning in the machine-shop scheduling domain.

Fine-grained multi-method planners also generate significantly shorter plans
than single-method planners (z=3.42, p<.01). They generate slightly longer plans
than coarse-grained multi-method planners; however, no significance is found at
a 5% level (z=1.77). These results likely arise because, whenever possible, both
types of multi-method planners use the more restrictive methods that yield
shorter plan lengths, while there may be little difference between the methods
that ultimately succeed for the two types of multi-method planners.

Table 4.6 illustrates the performance of these three types of planners over the
same test set of 100 problems used in Table 4.3 in the machine-shop scheduling
domain. As with the blocks-world domain, paired-sample Z-tests in the scheduling
domain, as shown in Table 4.7, indicate that fine-grained planners dominate both
single-method planners (z=10.91, p<.01) and coarse-grained planners (z=8.95,
p<.01) in terms of planning time. Fine-grained planners also generate
significantly shorter plans than do the single-method planners (z=6.49, p<.01).
They generate slightly shorter plans than coarse-grained multi-method planners;
however, no significance is found at a 5% level (z=1.28).

Figures 4.3 and 4.4 plot the average number of decisions versus the average
plan lengths for the data in Tables 4.4 and 4.6. These figures graphically
illustrate how the coarse-grained approach primarily reduces plan length in
comparison to the single-method approach, and how the fine-grained approach
primarily improves efficiency in comparison to the coarse-grained approach.

Figure 4.3: Performance of single-method planners (+), coarse-grained
multi-method planners (o), and fine-grained multi-method planners (*) in the
blocks-world domain.

Figure 4.4: Performance of single-method planners (+), coarse-grained
multi-method planners (o), and fine-grained multi-method planners (*) in the
scheduling domain.

Planning Type     Average Decisions   z vs. Single   z vs. Coarse-Grained
Single            ?                   -              -
Coarse-Grained    ?                   ?              -
Fine-Grained      ?                   ?              8.95**

(b) The significance test result for decisions.

Planning Type     Average Plan Length   z vs. Single   z vs. Coarse-Grained
Single            4.47                  -              -
Coarse-Grained    3.58                  ?              -
Fine-Grained      3.29                  ?              1.28

(c) The significance test result for plan lengths.

[? = entry illegible in the source.]

Table 4.7: Significance test results for the machine-shop scheduling domain.


4.4 Comparison with Partial-Order Planning

Partial-order planning can be more efficient than total-order planning, because
partial-order planning avoids premature commitment to an incorrect ordering
between operators, and thus reduces the size of the search space [Minton et al.,
1991, Barrett and Weld, 1992]. In particular, Barrett and Weld [1992] showed
experimentally that the total-order planner TOCL exhibited apparently
exponential time complexity while the partial-order planner POCL maintained
near-linear performance, as the number of problems was increased in the D^1S^1
domain. This section compares multi-method planning with partial-order planning,
and shows that multi-method planners can perform as well as partial-order
planners in this domain.


A template for generating operators in D^1S^1 is illustrated as follows:

(define-operator :action A_i :precondition {I_i} :add {G_i} :delete {I_{i-1}})

Note that operator A_i deletes the preconditions of operator A_{i-1}. This
implies that for any problem in this domain, there exists a single ordering of
operators which solves that problem.
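
The single-ordering property can be checked by brute force on a small instance.
The sketch below reads the template as: operator A_i requires I_i, adds G_i, and
deletes I_{i-1}. That reading, the assumption that the initial state contains
every I_i and the goal is every G_i, and all Python names are illustrative
choices rather than part of the original implementation.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class Operator:
    name: str
    precondition: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]


def make_operators(n: int) -> List[Operator]:
    """Instantiate the template for i = 1..n (A_1 has nothing to delete)."""
    operators = []
    for i in range(1, n + 1):
        delete = frozenset({f"I{i - 1}"}) if i > 1 else frozenset()
        operators.append(
            Operator(f"A{i}", frozenset({f"I{i}"}), frozenset({f"G{i}"}), delete)
        )
    return operators


def solves(order: Sequence[Operator], n: int) -> bool:
    """Simulate the sequence from the assumed initial state {I_1, ..., I_n}."""
    state = {f"I{i}" for i in range(1, n + 1)}
    for op in order:
        if not op.precondition <= state:
            return False  # a needed precondition was deleted earlier
        state = (state - op.delete) | op.add
    return all(f"G{i}" in state for i in range(1, n + 1))


def valid_orderings(n: int) -> List[List[str]]:
    """Enumerate every operator ordering that achieves all n goals."""
    operators = make_operators(n)
    return [
        [op.name for op in order]
        for order in permutations(operators)
        if solves(order, n)
    ]
```

For n = 4, only one of the 24 permutations survives, namely A1, A2, A3, A4,
because each A_i must run after A_{i-1} has used I_{i-1}.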

Planning in TOCL is similar to planning in M_6 (without learning) in the sense
that both are complete and search over the space of sequences of task operators.
In the search, operators can be tried which are not provably right, and
backtracking can occur if the choice is wrong. In contrast, POCL searches over
the space of ordering constraints. It tries out constraints that may be right,
and then backtracks over them if they prove wrong. In general, POCL can
outperform TOCL by avoiding premature step-ordering constraints.

The performance of TOCL can be improved by adding EBL. The idea is that
if a control rule is learned by EBL when a failure occurs and this rule can cut
off all similar failure paths, then TOCL may perform as well as POCL. However,
this may not lead to linear performance, because EBL is committing even less
than least commitment planning with respect to adding constraints. In EBL,
constraints (i.e. preference rules) are generated only when they are provably
correct, and they are never backtracked over. Proving that a constraint is
correct can be a non-trivial task, and may not guarantee a polynomial complexity
in planning time. On the other hand, in POCL the added constraints are not
proved correct. In the D^1S^1 domain, however, the constraints generated by POCL
do work without backtracking, since there is no operator which adds a
precondition of another operator.

Bias-relaxation multi-method planning can improve the performance of TOCL,
because a bias allows learning constraints based on weaker proofs, and
multi-method planning allows backtracking over these constraints if they prove
wrong. In this domain, a bias called precondition protection is used, which
eliminates all plans in which disachieved preconditions are reachieved. A
multi-method planner can be constructed which consists of a method using
precondition protection (denoted as M_p) and one without it, that is, the least
restricted planner (M_6). In fact, a precondition protection bias is weak enough
to solve all problems (thus M_p is complete) in this domain. However, M_p itself
may not be complete for other domains. In that case, backtracking across
incorrect preference rules (that is, across methods) can happen in M_p→M_6.

* D^1S^1 means that there is one entry in its operator's delete set and it only
takes one step to achieve a goal.


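                 M_p→M_6                     POCL
Number of   Number of    CPU Time     Number of   CPU Time
Goals       Nodes (S)    (Sec.)       Nodes       (Sec.)
  6            6           ?            12          0.15
  7            ?           ?            14          0.17
  8            ?          0.14          16          0.19
  9            ?          0.17          18          0.23
 10           10          0.18          20          0.26
 11           11          0.20          22          0.31
 12           12          0.21          24          0.31
 13           13          0.24          26          0.33

[? = entry illegible in the source. The rows for 1 to 5 goals are only partly
legible; the surviving entries are the CPU times 0.03, 0.05, 0.06, 0.10 and,
for 5 goals, 0.13.]

Table 4.8: Experimental results for M_p→M_6 and POCL.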


Table 4.8 shows experimental results for M_p→M_6 and POCL for 13 problems
in the D^1S^1 domain. In terms of the number of nodes visited, both planners
show linear performance. Note that the definitions of node in the two planners
are different because their search spaces are different. In M_p→M_6, a node
represents an element of the space of operator sequences (S), whereas in POCL a
node represents an element of the space of the set of operators (S') plus the
set of ordering constraints among them (O). This explains the factor of two
between these two columns.

The data for CPU time also exhibit (near) linear performance in both planners.
Although the simulation for these planners is done on the same machine, the
differences in CPU time for these two planners do not imply much, because they
are coded in different languages (Soar on top of C for M_p→M_6 and Lisp for
POCL). Nevertheless, the (near) linear performance of M_p→M_6 suggests that
M_p→M_6 can perform as well as POCL in this domain.

4.5  Summary 

In this chapter the notion of monotonicity in sequential multi-method planning
is investigated. In a monotonic multi-method planner, the single methods are
sequenced according to increasing coverage and decreasing efficiency. A formal
analysis shows that (1) if the cost of failure for a restricted planner is
always less than the cost of success for a more relaxed planner, then monotonic
multi-method planners outperform corresponding time-shared multi-method
planners; otherwise, time-shared multi-method planners may perform better;
(2) a monotonic multi-method planner takes less planning time than the
corresponding single-method planner, if the performance gain from using a
cheaper method is greater than the time wasted on inappropriate methods in the
monotonic multi-method planner; and (3) the lengths of plans generated from a
monotonic multi-method planner and the corresponding time-shared multi-method
planner are less than or equal to the length of plans generated from the
corresponding single-method planner.

A set of bias-relaxation multi-method planners has been constructed. In
bias-relaxation multi-method planning, each bias is evaluated in isolation.
Thus, bias-relaxation multi-method planning has a restricted scope in creating
and comparing individual methods. The constructed bias-relaxation multi-method
planners vary in the granularity at which individual methods are selected and
used. Depending on the granularity of method switching, two variations on
bias-relaxation multi-method planners are implemented: coarse-grained
multi-method planners, where methods are switched on a problem-by-problem basis;
and fine-grained multi-method planners, where methods are switched on a
goal-by-goal basis.

The experimental results in the blocks-world and machine-shop-scheduling
domains imply that (1) in terms of planning time, fine-grained multi-method
planners can be significantly more efficient than coarse-grained multi-method
planners and single-method planners; and (2) in terms of plan length, both
fine-grained and coarse-grained multi-method planners can be significantly more
efficient than single-method planners.

Finally, the comparison of multi-method planning with partial-order planning
in the D^1S^1 domain suggests that multi-method planning can be as efficient as
partial-order planning in terms of planning performance.
Chapter 5

Application  to  a  Complex  Domain 


The investigations of multi-method planning in the previous chapter have
occurred in the context of the blocks-world and machine-shop scheduling domains.
These are classical planning domains that provide good environments for
developing and evaluating multi-method planners. However, the intent here is to
transfer the multi-method planning technology to a more realistic domain; in
particular, a simulated battlefield domain.

The  task  focused  on  in  this  domain  is  to  simulate  automated  intelligent  agents 
that  can  accomplish  tactical  missions  in  navy  fighters.  One  interesting  aspect  of 
this  domain  is  that  the  main  criterion  to  evaluate  planning  is  how  well  the  missions 
can  be  accomplished,  whereas  planning  time  and  plan  length  are  secondary  criteria. 
This  chapter  shows  how  the  multi-method  planning  framework  can  be  applied  to 
domains  with  such  a  criterion. 

Since this domain involves the complexity of the real world and the domain
itself is not clearly defined, the full implementation of a planner that can be
used in such agents is beyond the scope of this thesis. The focus in this thesis
is on investigating the issues related to planning in this domain and
demonstrating planning capabilities that multi-method planners have, rather than
developing a real planner that can actually be deployed. The application of
multi-method planning to this domain will help in evaluating the degree of
domain independence provided by the multi-method planning framework, and is also
a step toward integrating the technology into a broader agent.




This  chapter  begins  with  an  overview  of  simulated  battlefield  environments  and 
describes  the  tactical  air  simulation  task.  Then,  it  demonstrates  how  multi-method 
planning  can  be  applied  to  this  task. 


5.1  Simulated  Battlefield  Environments 

The goal of the work in a simulated battlefield environment is to create agents
that act as virtual agents to participate in exercises with real human agents.
These exercises are to be used for training as well as for development of
tactics. In order for these exercises to be realistic, the agents must be able
to behave as much like humans as possible.

To approximate human behaviors, the agents must have capabilities including
obeying tactical missions, planning and reacting in real time, adapting to new
situations, learning from experience, exhibiting the cognitive limitations and
strengths of humans, interacting with other agents, and so on. Developing agents
with such capabilities is a non-trivial task with many real-world complexities.

Soar-IFOR is an attempt to build such agents within the Soar architecture.
Soar is a promising candidate for developing such agents, because it is a single
unified system which can integrate various components of AI technologies such
as problem-solving, planning, reasoning, learning, perception, motor control,
and so on. In addition, Soar is the basis for the development of unified
theories of human cognition [Newell, 1990], and thus can provide an appropriate
framework for modeling human-like agents.

To begin the effort to build automated intelligent agents for simulated
battlefield environments, Soar-IFOR has mainly focused on creating specific
automated agents, called TacAir-Soar, for simulated tactical air environments
[Jones et al., 1993, Rosenbloom et al., 1994].
5.2 Tactical Air Simulation

The goal of building TacAir-Soar is to construct automated intelligent agents
for flight simulators that are used to train navy pilots in flight tactics. For
example, it can be used in simulating a Barrier Combat Air Patrol (BARCAP)
mission; that is, to patrol the skies to protect a High-Value Unit (HVU) such as
an aircraft carrier. During the course of the mission, if the agent detects a
hostile aircraft, it intercepts the aircraft by firing missiles and then resumes
its patrol.

One  of  the  important  characteristics  of  tactical  air  simulation  is  that  it  is  a 
highly  reactive,  real-time,  I/O  intensive  task.  Thus,  the  agents  must  be  able  to 
make  decisions  in  real-time  and  react  appropriately  according  to  the  changes  in  the 
environment.  On  the  other  hand,  it  is  a  highly  goal-oriented  (or  mission-oriented) 
task.  The  goals  include  accomplishing  multiple  missions  and  survival.  Thus,  the 
agent  must  have  a  planning  capability  which  can  deal  with  multiple  goals  at  the 
same  time. 

Dealing with multiple goals involves the following issues: how to represent
multiple goals and their interaction, how to generate appropriate actions that
satisfy multiple goals at a time, how to decide on appropriate actions when
multiple goals require conflicting behaviors, which goals can be ignored if
necessary, and so on.

The next section presents a prototype agent which employs the multi-method
planning technique for tactical air simulation, and demonstrates how
multi-method planning can deal with these multiple goal issues. Although
TacAir-Soar is a highly reactive agent, the implementation of the prototype
agent based on the multi-method planning technique focuses on the planning
capabilities only, and not on the reactive capabilities.


5.3  Implementation 

The application of multi-method planning in tactical air simulation is based on
a beyond-visual-range (BVR) 1-v-1 aggressive bogey scenario [Jones et al., 1993,
Johnson, 1994, Tambe and Rosenbloom, 1994]. This scenario involves two armed
aircraft with similar capabilities. One aircraft (F14) is attempting to protect
a high-value unit and the other (MiG29) is attempting to destroy it. When the
two aircraft come in contact, they both attempt to intercept and destroy each
other, with the overall goals of accomplishing their missions (here, protecting
or attacking the HVU), situational awareness, and surviving.


Figure 5.1: A skeleton of the goal hierarchy for the 1-v-1 aggressive bogey scenario.


While performing BARCAP, if a bogey (an unknown aircraft) is noticed, the
F14 tries to determine whether the bogey is a bandit (an enemy aircraft). If the
bogey is identified as a bandit, the F14 attempts to destroy it by firing
missiles. In order to destroy it, the F14 selects a missile (a long-range
missile, LRM, here) and approaches the MiG29 close enough to get into its LRM's
launch-acceptability region (LAR). After launching an LRM, the F14 makes an
F-pole (a maneuver involving a 25-50 degree turn) to provide radar guidance to
the missile, while decreasing the closure between the two aircraft. The fight
continues until one aircraft is destroyed or runs away. A skeleton of the goal
hierarchy for this scenario from the F14's point of view is shown in Figure 5.1.

The key issue in implementing a planner in this domain is that some goals can
never be achieved completely. These goals are called maintenance goals.
Protect-HVU, situational-awareness and survive are such examples. Thus, the role
of an operator for a maintenance goal is not to achieve the goal but to
continuously maintain the status of the goal. For example, a single application
of the operator BARCAP does not achieve the goal protect-HVU. By applying this
operator continuously, the HVU remains protected.

Maintenance  goals  make  it  possible  to  define  multiple  achievement-levels  rather 
than  two  levels  —  achieved  and  unachieved.  For  example,  the  achievement-level 
for  the  goal  protect-HVU  is  maximum  when  there  is  no  threat  to  the  HVU,  while 
the  achievement-level  for  this  goal  is  minimum  when  the  HVU  is  destroyed.  When 
a  bogey  is  noticed,  the  achievement-level  is  decreased  from  the  maximum,  because 
the  bogey  can  potentially  destroy  the  HVU.  If  the  bogey  is  identified  as  a  bandit, 
the  achievement-level  is  further  decreased. 

From the opposite point of view, one can define multiple threat-levels for each
maintenance goal; that is, the threat-level is maximum when the
achievement-level is minimum, and vice versa. For each maintenance goal,
operators which decrease the threat-level for that goal are proposed.

The multiple threat-level scheme allows the notion of protection to be refined
for maintenance goals. Instead of protecting achieved goals from being undone,
it protects the threat-levels for other goals from being increased. The
strongest form of protection in this domain, denoted as GP_0, eliminates all
plans in which an operator increases the threat-level of another goal. Weaker
forms of protection, denoted as GP_i (i=1, ..., n-2, where n is the number of
threat-levels), eliminate all plans in which an operator increases the
threat-level of another goal by more than i.*


Goal  Flexibility  Dimension 
Directness  Nonlinear 


Table 5.1: The 2×n planning methods generated from directness and protection,
where there are n threat-levels.

The notions of directness and linearity are not changed here. However, using
a linearity bias is dangerous in this domain, because focusing on only one goal
conjunct and ignoring the other goal conjuncts until the current one is
completely achieved may cause a failure; that is, either the aircraft or the HVU
may be destroyed. Leaving linearity out yields a set of 2×n planning methods
derived from the two remaining bias dimensions (Table 5.1).

The key to implementing the prototype agent is that the threat-levels for
maintenance goals must remain as low as possible. By doing so, the probability
that one (or some) of the maintenance goals is seriously threatened can be
decreased. Single-method planners are not appropriate because if their biases
are too strong, they cannot solve the problem without seriously threatening
other goals. On the other hand, if their biases are too weak, planning takes too
much time since the search space is too large.

Based on the planning methods shown in Table 5.1, a fine-grained multi-method
planner is implemented for the 1-v-1 scenario. Table 5.2 presents an example of
how the implemented fine-grained multi-method planning actually works. In this
implementation, six threat-levels are used as shown in Table 5.2 (a).

* Another possible refinement is to use protection biases which eliminate all
plans in which an operator increases the total threat-levels across all other
goals by more than i.

(a) Six threat-levels for 1-v-1 aggressive bogey scenario.

Current state: The bandit is aggressive.

                                   Threat-Level Change
Goal          Proposed Operator    Goal          Before   After
protect-HVU   DESTROY-BANDIT       protect-HVU   3        2
                                   survive       3        4
survive       ESCAPE               survive       3        1
                                   protect-HVU   3        5

Selected operator: DESTROY-BANDIT.

(b) Operator selection when the bandit is aggressive.

Table 5.2: Example of fine-grained multi-method planning for tactical air domain.

Figure  5.2  (b)  shows  how  an  operator  is  selected  when  multiple  operators  are 
proposed  in  the  situation  that  the  bandit  is  aggressive.  In  this  situation,  the  threat- 
levels  for  both  goals  protect-HVU  and  survive  are  set  to  three,  because  the  aggres¬ 
sive  behavior  of  the  bandit  is  considered  as  a  medium  level  of  threat  for  protecting 
the  HVU  and  surviving.  For  the  goal  protect-HVU,  the  DESTROY-BANDIT  opera¬ 
tor  is  proposed  which  can  decrease  the  threat-level  to  minor  threat.  For  the  goal 
survive,  the  ESCAPE  operator  is  proposed  which  can  decrease  the  threat-level  to 
potential threat. In evaluating these operators, however, applying ESCAPE increases
the threat-level for the other goal protect-HVU to fatal threat, whereas applying
DESTROY-BANDIT increases the threat-level for survive to major threat. Since
DESTROY-BANDIT increases the threat-level for the other goal less than ESCAPE does,
DESTROY-BANDIT is selected here.
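The selection step in this example can be sketched in a few lines of Python. This is only an illustration, not the Soar implementation: the data layout, the function names, and the rule of summing increases over the other goals are assumptions made here, and the threat-level numbers are the ones given in part (b) of the table.

```python
# Threat levels before/after each proposed operator, copied from part (b) of
# the table. Layout (an assumption of this sketch):
#   operator -> (goal the operator was proposed for, {goal: (before, after)})
PROPOSALS = {
    "DESTROY-BANDIT": ("protect-HVU", {"protect-HVU": (3, 2), "survive": (3, 4)}),
    "ESCAPE": ("survive", {"survive": (3, 1), "protect-HVU": (3, 5)}),
}


def side_effect_increase(proposed_for, changes):
    """Total threat-level increase on goals other than the one the operator serves."""
    return sum(
        max(0, after - before)
        for goal, (before, after) in changes.items()
        if goal != proposed_for
    )


def select_operator(proposals):
    """Pick the operator that raises the other goals' threat-levels the least."""
    return min(proposals, key=lambda op: side_effect_increase(*proposals[op]))


print(select_operator(PROPOSALS))
```

With the numbers above, DESTROY-BANDIT raises the threat-level for survive by one, while ESCAPE raises the threat-level for protect-HVU by two, so the sketch picks DESTROY-BANDIT.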

5.4  Summary 

In this chapter, the application of multi-method planning to a tactical air domain
is briefly discussed. A preliminary investigation is made of some of the planning issues
in this domain, such as how to deal with maintenance goals and how to decide on
appropriate actions when multiple goals require conflicting behaviors. In doing this,
the notion of protection is refined such that one protects the threat-levels for other
goals from being increased. Multi-method planning based on refined protection
biases shows how appropriate actions can be generated by this planner.
Chapter 6


Related  Work 


This chapter describes work related to the multi-method planning framework.
Section 6.1 describes biases used in other planning systems. Section 6.2 compares the
planning and learning framework in Soar to other planning frameworks. Finally,
Section 6.3 compares the presented multi-method planning technique to other
related approaches.


6.1  Biases  in  Planning 

Some of the planning biases used here have been introduced by earlier planning
systems as planning heuristics. For example, the linearity assumption has been
used in planners using a goal stack because of its simplicity [Fikes and Nilsson,
1971]. Also, protection has been used in many planners to reduce the size of the
search space and to avoid generating non-optimal plans. These two biases are
discussed in more detail here.

6.1.1  Linearity 

In a conjunctive goal problem, the assumption that subgoals can be achieved
sequentially and thus that the generated plan is a sequence of complete subplans for
the conjunctive goals is known as the linearity assumption [Sussman, 1973].
Although many problems cannot be solved without interleaving goal conjuncts, this
assumption has two interesting properties.

First,  it  makes  the  original  problem  simpler  by  allowing  decomposition  of  the 
problem  into  a  set  of  subproblems  and  then  solving  each  subproblem  in  sequence. 
Since  only  a  single  goal  conjunct  is  considered  for  each  subproblem,  the  search 
space  to  solve  the  entire  problem  can  be  reduced. 

Second,  it  provides  a  basis  for  classifying  a  group  of  problems  in  terms  of  a 
problem’s  complexity.  Korf  [1987]  provided  a  more  refined  taxonomy  about  how 
subgoals  interact  with  each  other.  He  defined  a  set  of  subgoals  to  be  independent  if 
each  operator  only  changes  the  distance  to  a  single  subgoal.  Though  this  definition 
is  based  on  a  very  strong  assumption  about  goal  interference,  an  optimal  global 
solution  can  be  achieved  by  simply  concatenating  together  optimal  solutions  to  the 
individual  subproblems  in  any  order.  Solving  a  single  independent  subgoal  might 
be  nontrivial,  but  the  complexity  of  problems  with  independent  subgoals  increases 
only  linearly  with  the  number  of  subgoals. 

Also,  he  defined  a  set  of  subgoals  to  be  serializable  if  there  exists  an  ordering 
among  the  subgoals  such  that  the  subgoals  can  always  be  solved  sequentially  with¬ 
out  ever  violating  a  previously  solved  subgoal  in  the  order.  Since  this  definition  is 
based  on  the  linearity  assumption  and  goal  protection,  a  problem  which  consists 
of  serializable  subgoals  can  be  classified  as  an  element  of  A2  —  that  is,  the  set  of 
problems  solvable  by  the  linear  protection  method  —  in  the  multi-method  planning 
framework. 

Barrett and Weld [1992] defined a set of subgoals to be trivially serializable if
they can be solved in any order without ever violating a previously solved subgoal.
From this definition it is implied that if a set of subgoals is independent, it is
trivially serializable, and that if a set of subgoals is trivially serializable, it is
serializable.
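The two serializability definitions can be contrasted with a small Python sketch. It assumes a hypothetical oracle, `solvable_in_order`, that reports whether the subgoals can be solved in a given order without violating an already solved subgoal; in practice that answer would come from a planner, and the function and label names here are illustrative only.

```python
from itertools import permutations


def classify(subgoals, solvable_in_order):
    """Classify a subgoal set by how many orderings the oracle accepts."""
    orders = list(permutations(subgoals))
    workable = [order for order in orders if solvable_in_order(order)]
    if len(workable) == len(orders):
        return "trivially serializable"  # every ordering works
    if workable:
        return "serializable"            # at least one ordering works
    return "non-serializable"


# Hypothetical oracle for stacking A on B on C from the table: B must be
# placed on C before A is placed on B, or the first subgoal gets undone.
def tower_oracle(order):
    return order.index("on(B,C)") < order.index("on(A,B)")


print(classify(["on(A,B)", "on(B,C)"], tower_oracle))
```

For the tower example only one of the two orderings is accepted, so the subgoal set is serializable but not trivially serializable.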
6.1.2 Protection


The notion of protection was introduced in HACKER [Sussman, 1973]. In HACKER,
a protection violation is detected if “a protected subgoal is clobbered between
the time it is established and the time it is no longer needed” [Sussman, 1973,
page 63]. HACKER deals with protection violations by employing procedures called
critics that recognize such violations. When necessary, HACKER is able to repair
the plan by rearranging the steps in the plan.

Waldinger [1977] developed an approach, called goal regression, to protect
achieved goals. It involves creating a plan to solve one subgoal followed by
constructive modifications to achieve the other subgoals. It differs from HACKER in
that it uses the notion of goal protection to guide the linear placement of actions in
the plan. Rather than building incorrect plans and then debugging them, it builds
partial linear plans in non-sequential order and moves subgoals backwards through
the partial linear plans to where they do not interfere with other subgoals. Vere
[1983] also developed a technique, called splicing, which relaxes protection when it
has caused a deadlock.

SNLP  uses  causal  links  to  represent  protection  intervals  and  deals  with  threats 
to  them. 


6.2  Planning  and  Learning  in  Soar 

In the thesis, planning operators are represented by operator proposal rules,
operator application rules, goal expansion rules, and instantiated Soar operators in
working memory. However, this is not the only way to represent planning operators
in Soar. For example, Unruh’s [1993] operator representation for abstraction
includes rules to check operators’ preconditions explicitly before applying operators.
Also, in Soar, goal expansion for a violated precondition is usually implemented
by creating a new Soar subgoal and achieving the violated condition within the
subgoal, via the operator subgoaling scheme [Laird et al., 1987].


Planning  in  Soar  is  similar  to  planning  in  PRODIGY  in  that  both  systems  use 
a  set  of  preference-based  control  rules  to  yield  a  sequence  of  operators.  One  of 
the  differences  between  these  two  systems  is  that  while  Soar  learns  control  rules 
from  the  result  of  look-ahead  search,  PRODIGY  learns  control  rules  from  its  own 
problem-solving  trace  [Minton,  1988]  or  from  a  static  analysis  of  the  domain  theory 
[Etzioni, 1990a]. Another difference is that the original PRODIGY (version 2.0) uses a
linear planning approach [Minton et al., 1989]. Veloso [1989] developed a nonlinear
version of PRODIGY, but the learning method used was a case-based approach.


6.3  Multi-Method  Planning 

The basic approach of bias relaxation in multi-method planning is similar to the
shift of bias for inductive concept learning [Russell and Grosof, 1987, Utgoff, 1986].
In the planning literature, this approach is closely related to ordering modification,
a control strategy that prefers exploring some plans before others [Gratch
and DeJong, 1990]. If the preference is wrong, the alternatives will eventually be
reached. Thus, ordering modification retains planner completeness. Gratch and
DeJong explicitly distinguished this modification from structural modification,
which prunes portions of the potential plan space. Planning systems which employ
multi-method planning techniques include STEPPINGSTONE [Ruby and Kibler, 1991]
and FAILSAFE-2 [Bhatnagar and Mostow, 1990]. These two systems are discussed in
more detail here, and a comparison of multi-method planning with partial order
planning is presented.

6.3.1  STEPPINGSTONE 

STEPPINGSTONE  is  a  learning  problem-solver  that  decomposes  a  problem  into  sim¬ 
ple  and  difficult  subproblems.  It  solves  the  simple  subproblems  with  an  inexpensive 
constrained  problem  solver.  To  solve  the  difficult  subproblems,  STEPPINGSTONE 
uses  an  unconstrained  problem  solver.  Once  it  solves  a  difficult  subproblem,  it  uses 
the  solution  to  generate  a  sequence  of  subgoals,  or  steppingstones,  that  can  be  used 
by the constrained problem solver to solve this difficult subproblem when it occurs
again.

The constrained problem solver takes as input a set of subgoals which are
ordered based on a heuristic called openness. It attempts to solve the subgoals
in the given order, and generates a solution for the subgoals, if one is found. The
constraint used in this problem solver is that each solved subgoal is protected. If
the constrained problem solver is unable to solve a subgoal, a memory component
is called. The memory component is based on a case-based approach. It matches
the current problem-solving context (that is, the subgoal currently being solved,
the currently protected subgoals, and the current state) with stored contexts,
then returns the ordered subgoals for the matching context. When the memory
component fails to return any useful subgoal ordering, the unconstrained problem
solver is called. The unconstrained problem solver relaxes the protection on the
solved subgoals to find a solution.

Since STEPPINGSTONE generates a solution according to the prescribed subgoal
ordering, the constrained problem solver is comparable to M3 (the linear planner
with protection), while the unconstrained problem solver is comparable to M5
(the linear planner without protection). This implies that STEPPINGSTONE is
close to the sequential multi-method planner M3 → M5; however, the difference
between these two is that STEPPINGSTONE has a case-based memory component
in between M3 and M5, which is analogous to the transfer of control rules across
problems.


6.3.2 FAILSAFE-2

FAILSAFE-2  (FS2)  is  a  system  that  performs  adaptive  search  by  learning  from  its 
failures.  The  FS2  problem  solver  uses  two  types  of  search  control  knowledge:  goal 
selection  rules  to  constrain  the  selection  of  which  goal  to  pick  as  the  next  current 
goal;  and  censors  to  constrain  the  selection  of  which  operator  to  apply  to  the 
current  state. 
There are two types of interactions between the problem solver and learner.
The first type occurs when the search is under-constrained. The symptoms of
under-constrained search include violating a protected goal, reaching a state-loop,
and exceeding a preset goal-depth limit. If any of these symptoms is found, the
problem solver declares a failure and invokes the learner. If the learner is able to
identify the problem solving step that led to the failure, it adds a new censor to
prevent similar failures in the future.

The other type of interaction between the problem solver and learner occurs
when the search is over-constrained. Over-constrained search prunes away all
solution paths. Domain-independent heuristics are used to detect over-constrained
search. When it is detected, the problem solver calls a heuristic procedure which
relaxes a censor. If relaxing the censor leads to achieving the current goal, FS2
infers that the censor was over-general and calls the learner to specialize it.

The basic idea of censor relaxation in FS2 is close to the bias relaxation
mechanism in the thesis. However, there are a number of differences, such as the
granularity at which censors are relaxed and the way censors are relaxed. Whenever
applying an operator to the current state violates a censor, that state is marked as
suspended. Once the problem solver cannot make progress by forward search with
the censor, FS2 selects one of the suspended states that is likely to be closest to
the goal based on a heuristic, and uses a weak form of backward chaining (WBC)
which recurses on the failed preconditions of an operator one at a time. If a
solution is found by this relaxation, the censor is specialized, so that it does
not prevent the expansion of the search tree in the future.

6.3.3  Partial  Order  Planners 

Fine-grained multi-method planning is related to traditional partial-order
planning, where heuristics are used to guide search over the space of partially ordered
plans without violating planner completeness. For example, SNLP [McAllester and
Rosenblitt, 1991, Barrett and Weld, 1992] uses a heuristic which prefers nodes with
fewer unresolved goals. Using directness in fine-grained multi-method planners is
similar to this heuristic, because applying an operator without violating directness
reduces the number of unachieved goals by at least one.

The least-commitment approach can be viewed as planning which starts with
the strong assumption that the problem can be solved without any ordering
constraints, and relaxes that assumption by adding ordering constraints successively
only as necessary. In this sense, it is similar to the bias-relaxation approach
which starts from a set of biases and relaxes the biases only when the problem (or
subproblem) cannot be solved with those biases.


Chapter  7 


Conclusion 

This  chapter  summarizes  the  methodology  used  in  the  thesis  and  the  results,  and 
then  presents  some  of  the  limitations  of  this  methodology  and  future  work. 

7.1  Summary  of  the  Approach  and  Results 

In this thesis, two hypotheses are investigated in depth: (1) no single planning
method will satisfy both sufficiency and efficiency for all situations; and (2)
multi-method planning can outperform single-method planning in terms of sufficiency
and efficiency. To evaluate these hypotheses, a set of single-method planners and
a set of multi-method planners have been created. The creation of these planners
is based on the notion of bias in planning.

Bias is a useful notion in planning because it can potentially reduce
computational effort by reducing the number of plans that must be examined, and it can
potentially generate shorter plans by avoiding plans containing inefficient operator
sequences. By varying the amount of bias used, a set of planning methods with
different performance and scope can be generated.

To evaluate the first hypothesis, a system has been constructed that can
utilize different single-methods, which are defined along two bias dimensions:
goal-flexibility and goal-protection. The goal-flexibility dimension determines the
degree of flexibility the planner has in generating new subgoals and in shifting the
focus in the goal hierarchy. This dimension subsumes directness and linearity
biases. The goal-protection dimension determines whether or not an achieved
top-level goal conjunct is protected between the time it is achieved and the time it is no
longer needed. By taking the cross-product of these two dimensions, six different
methods are created.
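The cross-product of the two dimensions can be enumerated directly. In the following Python sketch the setting labels are illustrative; in particular, "nonlinear" is an assumed name for the goal-flexibility setting that uses neither the directness nor the linearity bias, and no mapping to the method numbering used in the thesis is implied.

```python
from itertools import product

# Goal-flexibility settings, from most to least restrictive (labels assumed).
GOAL_FLEXIBILITY = ("directness", "linearity", "nonlinear")
# Goal-protection settings (labels assumed).
GOAL_PROTECTION = ("protection", "no-protection")

# Every pairing of one setting from each dimension defines one method.
METHODS = list(product(GOAL_FLEXIBILITY, GOAL_PROTECTION))

for flexibility, protection in METHODS:
    print(f"{flexibility:<10} + {protection}")
```

Three goal-flexibility settings combined with two goal-protection settings give the six methods.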

These methods have been implemented in Soar. In Soar, plans are represented
as sets of variabilized control rules and sets of instantiated preferences that jointly
specify which operators should be executed at each point in time. The effect of
learning in these methods with respect to the performance of planning has been
investigated. The six implemented methods have been compared empirically in
terms of planner completeness, planning time, and plan length. The experimental
results show a trade-off between completeness and efficiency for these methods:
if a method is too restricted, it cannot generate plans for some problems,
while if it is too relaxed, it takes too much time in generating plans, and the
generated plans are inefficient.

As an alternative approach to single-method planners, multi-method planners
have been created. A multi-method planner consists of a coordinated set of
planning methods, where each individual method has different scope and performance.
Given a set of created methods, the key issue here is how to coordinate the methods
in an efficient manner so that the multi-method planner can have high performance.
This includes issues of selecting appropriate methods as situations arise, and the
granularity of method switching as the situational demands shift.

For  the  method  selection  issue,  two  ways  of  organizing  individual  methods  in 
a  multi-method  planner  —  sequential  and  time-shared  —  have  been  compared 
analytically.  The  wasted  effort  in  a  sequential  multi-method  planner  is  the  cost  of 
trying  earlier  methods  in  the  sequence,  whereas  the  wasted  effort  in  a  time-shared 
multi-method  planner  is  the  cost  of  trying  all  methods  in  the  method  set  except  the 
one  that  actually  solves  the  problem.  The  wasted  effort  in  sequential  multi-method 
planning  is  sensitive  to  the  ordering  of  the  methods  because  it  takes  too  much  time 
if  earlier  methods  are  not  efficient  enough,  or  in  an  extreme  case,  it  may  not  be 
able  to  generate  a  plan  at  all  if  one  of  the  earlier  methods  does  not  halt.  On  the 
other hand, the wasted effort in time-shared multi-method planning is sensitive to
the number of individual methods.
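The two notions of wasted effort can be made concrete with a small Python sketch. It is not the analytical model of the thesis: it assumes each method has a known running time on the problem, that a failing method stops after its running time, and that time-sharing gives every method equal slices, so that each losing method has consumed at most the winner's finishing time. All names and numbers are hypothetical.

```python
def sequential_waste(costs, solves):
    """Effort spent on the methods tried before the first one that succeeds."""
    wasted = 0.0
    for cost, ok in zip(costs, solves):
        if ok:
            return wasted
        wasted += cost
    return wasted  # no method solved the problem


def time_shared_waste(costs, solves):
    """Effort spent on every method except the first to finish with a solution.

    Assumes at least one method succeeds and that all methods receive equal
    time slices, so each other method has run for min(own time, finish time).
    """
    finish = min(c for c, ok in zip(costs, solves) if ok)
    winner = next(
        i for i, (c, ok) in enumerate(zip(costs, solves)) if ok and c == finish
    )
    return sum(min(c, finish) for i, c in enumerate(costs) if i != winner)


# Hypothetical running times (seconds); the 5.0-second method fails.
costs, solves = [5.0, 1.0, 2.0], [False, True, True]
print(sequential_waste(costs, solves))                              # poor ordering
print(sequential_waste([1.0, 2.0, 5.0], [True, True, False]))       # good ordering
print(time_shared_waste(costs, solves))                             # order-independent
```

Under these assumptions the sequential figure changes when the methods are reordered, while the time-shared figure depends only on how many other methods share the processor and how long the winner takes.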

As an approach to reducing the wasted time in sequential multi-method
planning, monotonic multi-method planning has been investigated. In a monotonic
multi-method planner, the individual methods are ordered according to
decreasing efficiency and increasing coverage based on the empirical performance of those
methods for a training set of problems. A formal analysis shows that (1) a
monotonic multi-method planner takes less planning time than the corresponding
single-method planner, if the performance gain by using a cheaper method is greater than
the wasted time by using inappropriate methods in the monotonic multi-method
planner; and (2) the lengths of plans generated from a monotonic multi-method
planner are less than or equal to the length of plans generated from the
corresponding single-method planner.
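One way to read the ordering criterion is as a check on per-method training statistics. The Python sketch below is an assumption-laden illustration: it measures coverage by the number of training problems solved and efficiency by average planning time, treats ties as acceptable, and uses made-up numbers; the thesis may define these quantities differently.

```python
from typing import NamedTuple


class MethodStats(NamedTuple):
    name: str
    solved: int      # training problems solved (coverage)
    avg_time: float  # average planning time; lower means more efficient


def is_monotonic(sequence):
    """True if coverage and average time both never decrease along the sequence."""
    return all(
        a.solved <= b.solved and a.avg_time <= b.avg_time
        for a, b in zip(sequence, sequence[1:])
    )


# Hypothetical training statistics for three methods.
stats = [
    MethodStats("restricted", solved=18, avg_time=0.4),
    MethodStats("intermediate", solved=25, avg_time=0.9),
    MethodStats("unbiased", solved=30, avg_time=2.5),
]
print(is_monotonic(stats))
print(is_monotonic(list(reversed(stats))))
```

A sequence that passes this check tries cheap, narrow methods first and falls back to expensive, broad ones, which is the ordering the paragraph above describes.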

To  restrict  the  scope  of  individual  methods  to  be  generated  and  compared,  a 
set  of  bias-relaxation  multi-method  planners  has  been  constructed  based  on  the 
notion  of  effective  bias.  In  a  bias-relaxation  multi-method  planner,  planning  starts 
by  trying  highly  efficient  methods,  and  then  successively  relaxes  effective  biases 
until  a  sufficient  method  is  found. 
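The control loop of a coarse-grained bias-relaxation planner can be sketched as follows. This is a minimal Python illustration rather than the Soar implementation: the `(name, planner)` pair layout, the convention that a planner returns `None` when the problem lies outside its scope, and the toy planners are all assumptions introduced here.

```python
def bias_relaxation_plan(problem, methods):
    """Try methods from most to least restricted; return the first plan found."""
    for name, planner in methods:
        plan = planner(problem)
        if plan is not None:
            return name, plan
    return None, None  # no method in the sequence was sufficient


def make_toy_planner(level):
    """Hypothetical stand-in for a planning method with a given relaxation level."""
    def planner(problem):
        # Succeeds only if the problem needs no more relaxation than allowed.
        if problem["required_level"] <= level:
            return [f"step-{i}" for i in range(problem["length"])]
        return None
    return planner


methods = [
    ("restricted", make_toy_planner(0)),
    ("relaxed", make_toy_planner(1)),
    ("unbiased", make_toy_planner(2)),
]
name, plan = bias_relaxation_plan({"required_level": 1, "length": 3}, methods)
print(name, plan)
```

In this toy run the most restricted method fails, the next one succeeds, and the least restricted method is never invoked.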

The second issue of coordinating individual methods in multi-method planning
is the granularity at which individual planning methods are switched. In
coarse-grained multi-method planners, methods are switched for a whole problem
when no solution can be found for the problem within the current method. In
fine-grained multi-method planners, methods can be switched at any point during a
problem at which a new set of subgoals is formulated, and the switch only occurs
for that set of subgoals (and not for the entire problem). Both coarse-grained
multi-method planners and fine-grained multi-method planners are implemented
via bias relaxation.

There is a trade-off between coarse-grained multi-method planning and
fine-grained multi-method planning. A coarse-grained multi-method planner finds a
solution within the first method that has one, at the cost of searching the entire
biased space in the worst case. On the other hand, a fine-grained multi-method
planner can save the effort of searching all other alternatives within the current
method; however, it is not guaranteed to find a solution that may exist within
the current biased space.

The experimental results in the blocks-world and machine-shop-scheduling
domains imply that (1) in terms of planning time, fine-grained multi-method
planners can be significantly more efficient than coarse-grained multi-method planners
and single-method planners; and (2) in terms of plan length, both fine-grained
and coarse-grained multi-method planners can be significantly more efficient than
single-method planners.

In summary, the primary contribution of this thesis is to develop a new
multi-method planning framework. This framework is developed based on the notion
of bias (for method creation), and the notions of monotonicity, bias-relaxation,
and the granularity of method switching (for method coordination). The
experimental results indicate that, at least for the domains investigated, the created
multi-method planners are more efficient than complete single-method planners.

7.2  Limitations  and  Future  Work 

The multi-method planning framework investigated in this thesis is based on three
biases: linearity, protection, and directness. One way to enhance the multi-method
planning framework would be to extend the set of biases available. These biases
include ones that limit the size of the goal hierarchy such as goal-depth or
goal-breadth (to reduce the search space), limit the length of plans generated such as
plan-length (to shorten execution time), and lead to learning more effective rules
such as goal-nonrepetition (to increase transfer).

The multi-method planners used here do not guarantee finding optimal plans
for a given problem. However, if a plan-length bias is incorporated with
coarse-grained multi-method planning, where the bound is incrementally specified along
with the sequence of individual methods, the multi-method planner will be able to
find optimal plans for all problems. In fact, this approach implements depth-first
iterative-deepening [Korf, 1985] on the length of plans generated.
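The idea of relaxing a plan-length bound step by step can be illustrated with a generic depth-first iterative-deepening search. The Python sketch below uses a toy integer domain with two hypothetical operators; it is not the planner described in the thesis, only an instance of the search scheme the paragraph refers to.

```python
def bounded_dfs(state, goal, operators, bound, plan):
    """Depth-first search that gives up once the plan has reached `bound` steps."""
    if state == goal:
        return plan
    if len(plan) == bound:
        return None
    for name, apply_op in operators:
        result = bounded_dfs(apply_op(state), goal, operators, bound, plan + [name])
        if result is not None:
            return result
    return None


def iterative_deepening_plan(start, goal, operators, max_bound):
    """Relax the plan-length bound one step at a time.

    Because every shorter bound has already failed, the first plan found
    has minimal length.
    """
    for bound in range(max_bound + 1):
        plan = bounded_dfs(start, goal, operators, bound, [])
        if plan is not None:
            return plan
    return None


# Toy domain (an assumption of this sketch): reach 10 from 1.
OPERATORS = [("increment", lambda n: n + 1), ("double", lambda n: n * 2)]
print(iterative_deepening_plan(1, 10, OPERATORS, max_bound=6))
```

Each increase of the bound plays the role of one relaxation of the plan-length bias, at the cost of repeating the search done under the tighter bounds.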

The bias selection approach used here is based on preprocessing a set of
training examples in order to develop fixed sequences of biases (and methods). This
approach has a limitation when it is hard to generate testing problems or when
the problem distribution is unknown. A more dynamic, run-time approach would
be to learn, while doing, which biases (and methods) to use for which classes of
problems. If such learned information can transfer to the later problems, much of
the effort wasted in trying inappropriate methods may be reduced.

One problem of learning about which methods to use for which classes of
problems in the current multi-method framework is that some rules (chunks) are
expensive. Restricting expressiveness on the encoding of tasks, such as by the
unique-attribute scheme, can solve this problem [Tambe et al., 1990]; however, the learned
rules based on this scheme may not be general enough to transfer to the later
problems because of the limited expressibility. Another approach to solving the
expensive chunk problem is to incorporate search control knowledge into the
explanation [Kim and Rosenbloom, 1993]. This approach can solve the expensive chunk
problem without restricting expressiveness. Learned rules can be used with the
cost bounded by the cost of the problem solving from which they were learned.

The methodology for generating a set of monotonic multi-method planners or a
set of bias-relaxation multi-method planners does not specify which multi-method
planner is the optimal one for a given problem distribution. Greiner [1992]
developed an algorithm called PALO which searches the space of performance elements
and selects a near locally optimal element by using statistical techniques to
approximate the distribution. By employing the PALO algorithm in the multi-method
planning framework, it may be possible to generate the optimal multi-method
planner for a given distribution.

Within simulated battlefield environments, the focus of this thesis is on planning
based on the beyond-visual-range 1-v-1 aggressive bogey scenario. One of the
directions for future work in this domain includes applying the technique described in
Chapter 5 to other scenarios, such as 1-v-2, 2-v-1, and 2-v-n scenarios, or the Within
Visual Range scenario. Application to air-to-ground or ground-to-ground combat
simulation would be another possibility.
Appendix A


Experimental Results: The Blocks-World
Domain


This appendix gives the detailed numeric information from the experiments in
the blocks-world domain. Appendix A.1 presents the experimental results for the
six single-method planners over 30 training problems. Appendix A.2 presents the
experimental results for the six single-method planners and the created
multi-method planners over 100 test problems.
A.1 Performance over 30 Training Problems
Table A.2: Performance of M2 over 30 training problems in the blocks-world domain.

Table A.5: Performance of M5 over 30 training problems in the blocks-world domain.

Table A.6: Performance of M6 over 30 training problems in the blocks-world domain.


A.2 Performance over 100 Testing Problems

The  entries  in  the  tables  are  defined  as  follows: 

N:  Problem  number 
B:  Number  of  blocks 
G:  Number  of  goal  conjuncts 
SP:  Solved  problems. 
D: Number of decisions
L:  Plan  length 
C:  CPU  time  (sec.) 
Table A.7: Performance of single-method planners over 100 testing problems in the
blocks-world domain.

Table A.9: Performance of coarse-grained multi-method planners over 100 testing
problems in the blocks-world domain.
1  0.22
4  15 3 0.46  15 3 0.45  15 3 0.45  15 3 0.44  29 3 1.03
3  37 3 0.59  33 3 0.64  27 2 0.49  34 2 0.43
2
3
3  0.33  14 2
3  0.32
3  0.47  14 2 0.22
3
3  0.26  15 3 0.26  15 3 0.26
3  0.77  15 3 0.26
3  15 3 0.37
3  0.27  15 3 0.36  32 3 0.52  24 3 0.49

N    B  G     D      L     C         D      L     C         D      L     C         D      L     C         D      L     C
100  4  4     15     3     0.46      15     3     0.45      15     3     0.45      63     3     3.75      34     3     0.78

Total         3613   358   86.33     3679   394   83.06     3604   303   82.17     3934   258   115.89    3838   393   93.04
SP            100    100   100       100    100   100       100    100   100       100    100   100       100    100   100
Average       36.13  3.58  0.86      36.79  3.94  0.83      36.04  3.03  0.82      29.34  2.58  1.16      38.38  2.93  0.93


Table A.10: Performance of coarse-grained multi-method planners over 100 testing
problems in the blocks-world domain (continued).
2
4
1
13
40
2  3  0
0
3
3  2
2
0.56
0.50
3
28 3 0.66  33 4
4
0.24  13 1 0.23  13 1
3
3
3  0.37
2.36
3  0.37
45  4  4  21 2 0.70
2
21 2
2  0.71
2  0.68
46  4  4  14 2 0.34  14 2 0.34  14 2 0.34
14 2 0.33
3  2
3  0.66
3  0.63
3  0.50  22 2 0.45  27 3 0.50

  N  B  G      D     L      C      D     L      C      D     L      C      D     L      C      D     L      C
 48  3  3     13     1   0.19     13     1   0.19     13     1   0.19     13     1   0.19     13     1   0.19
 49  4  4     16     4   0.54     16     4   0.54     16     4   0.54     39     5   1.68     43     4   1.73
 50  3  3     15     3   0.27     15     3   0.28     15     3   0.27     30     3   0.78     24     3   0.49

Table A.11: Performance of coarse-grained multi-method planners over 100 testing
problems in the blocks-world domain (continued).
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Ka 
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100 
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iBEsra: 

1  38.91  3.59 

1.03 

37.30  3.03 
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31.77  2.97 

1.34 

30.67  2.57 

1.05 

34.47  2.95 

1.35

Table A.12: Performance of coarse-grained multi-method planners over 100 testing
problems in the blocks-world domain (continued).
Table A.13: Performance of fine-grained multi-method planners over 100 testing
problems in the blocks-world.
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Table  A.14:  Performance  of  fine-grained  multi-method  planners  over  100  testing 
problems in the blocks-world (continued).
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1.39 

30 

4 

0.85 

13 

4 

0.53 

66 

5 

3.37 

29 

6 

1.51 

  N  B  G      D     L      C      D     L      C      D     L      C      D     L      C      D     L      C
 33  3  3      8     1   0.10      8     1   0.11      8     1   0.11      8     1   0.10      8     1   0.10
 34  3  3      9     3   0.14      9     3   0.14      9     2   0.13      9     3   0.13     16     2   0.33
 35  4  3     19     4   0.76     31    10   1.51     31     9   1.88     31     5   0.93     51    11   2.71
 36  4  3      8     1   0.14      8     1   0.14      8     1   0.14      8     1   0.14      8     1   0.14
 37  3  3      8     1   0.10      8     1   0.10      8     1   0.10      8     1   0.11      8     1   0.10
 38  4  3     37     4   1.09     33     7   1.49     37     4   1.14     57     5   2.47     82    12   4.87
 39  3  3      8     1   0.10      8     1   0.09      8     1   0.09      8     1   0.10      8     1   0.09
 40  3  3      3     0   0.03      3     0   0.03      3     0   0.03      3     0   0.03      3     0   0.04
 41  3  3     10     3   0.16     10     3   0.15     10     3   0.16     17     2   0.30     10     2   0.15
 42  3  3     30     4   0.47     13     4   0.36     18     3   0.37     18     3   0.36     12     4   0.24
 43  4  3      8     1   0.14      8     1   0.14      8     1   0.14      8     1   0.14      8     1   0.14
 44  4  3     10     3   0.37     10     3   0.38     10     3   0.38     58     3   2.34     10     3   0.27
 45  4  4     16     3   0.48     16     3   0.47     16     2   0.48     16     3   0.48     16     2   0.47
 46  4  4      9     3   0.33      9     3   0.33      9     2   0.23     16     3   0.43      9     2   0.23
 47  3  3     17     3   0.34     19     3   0.42     19     3   0.40     19     3   0.37     19     3   0.41
 48  3  3      8     1   0.10      8     1   0.10      8     1   0.11      8     1   0.10      8     1   0.11
 49  4  4     11     4   0.41     11     4   0.43     11     4   0.42     18     4   0.66     38     4   1.69
 50  3  3     10     3   0.17     10     3   0.17     10     3   0.17     17     3   0.35     17     3   0.34

Table A.15: Performance of fine-grained multi-method planners over 100 testing
problems in the blocks-world (continued).
  N  B  G      D     L      C      D     L      C      D     L      C      D     L      C      D     L      C
 51  4  3     10     3    OJO     10     3   0.39     10     3   0.39     56     5    3J8     10     3   0.38
 52  3  3      8     1   0.10      8     1   0.11      8     1   0.11      8     1   0.11      8     1   0.11
 53  3  3      9     3   0.13      9     3   0.13      9     3   0.12      9     3   0.13     35     7   0.73
 54  4  3     10     3   0.30     10     3    OJO     10     3   0.39     40     4   1.47     10     3   0.38
 55  4  4     11     4    0J8     11     4   0.40     11     4   0.38     48     5   3.38     11     4    0J7
 56  3  3      8     1     OM      8     1    OUM      8     1   0.10      8     1     OM      8     1   0.10
 57  4  4      9     3   0.34      9     3   0.34      9     3   0.33     34     3   0.78      9     3   0.33

58  3 

3 

9 

3 

0.13 

9 

3 

0.13 

9 

3 

0.13 

33 

3 
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9 

3 
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4 

30 

6 
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15 

6 

0.83 

IS 

6 
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64 

6 
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30 

6 
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3 

3 

0 
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3 

0 

0.03 

3 

0 
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3 

0 
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3 

0 

OM 
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3 

9 

3 
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9 

3 
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9 

3 

0.14 

34 

3 

0.58 

9 

3 
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3 

18 

3 

0.38 

37 

11 
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18 

3 

0J8 

34 

4 

0.88 

57 

IS 

3.33 
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3 

9 

3 

0.13 

9 

3 

0.13 

9 

2 

0.13 

9 

3 

0.12 

16 

3 

0.39 

  N  B  G      D     L      C      D     L      C      D     L      C      D     L      C      D     L      C
 64  3  3      9     3   0.13      9     3   0.13      9     3   0.13     16     3   0.31      9     3   0.13
 65  3  3      9     3   0.13      9     3   0.13      9     2   0.13     16     3   0.38      9     3   0.12
 66  4  3      8     1   0.15      8     1   0.14      8     1   0.14      8     1   0.14      8     1   0.14
 67  3  3      9     3   0.13      9     3   0.12      9     3   0.13     34     3   0.37      9     3   0.13
 68  4  3      3     0   0.04      3     0   0.04      3     0   0.04      3     0   0.04      3     0   0.04
 69  4  3     48     8   3.65     53    13    3J7     37     4   1.11     33     4   1.18    103    19   7.52
 70  3  3     13     4   0.38     11     3   0.21     13     4   0.38     13     4   0.37     30     4   0.46
 71  3  3      8     1   0.10      8     1   0.10      8     1   0.09      8     1   0.09      8     1   0.10
 72  3  3      8     1   0.10      8     1   0.11      8     1   0.11      8     1   0.11      8     1   0.10
 73  4  3      8     1   0.15      8     1   0.14      8     1   0.14      8     1   0.14      8     1   0.14
 74  4  3      9     3   0.30      9     3   0.20      9     3   0.20     44     6   2.07      9     3   0.30
 75  3  3     11     3   0.19     11     3   0.19     11     3   0.19     33     3   0.79     11     3   0.19
 76  3  3      9     3   0.13      9     2   0.13      9     2   0.13      9     2   0.13     16     2   0.39
 77  4  3     31     5   0.83     13     4   0.49     13     4   0.49    104     6   5.00     45     7   3.15
 78  4  3     17     3   0.48     17     3   0.48     17     3   0.46     17     3   0.40     17     3   0.47
 79  4  3     18     3   0.56     18     3   0.55     27     4   0.95     56     4   3.45     27     4   0.95
 80  3  3      3     0   0.04      3     0   0.03      3     0   0.04      3     0   0.04      3     ?      ?
 81  3  3     10     3   0.17     10     3   0.17     10     3   0.17     17     3   0.36     17     3     OM
 82  3  3      9     3   0.13      9     3   0.14      9     3   0.14     35     3   0.63      9     3   0.13
 83  3  3      9     3   0.13      9     3   0.14      9     3   0.14      9     3   0.14     16     a   0.31
 84  3  3     32     4   0.56     11     3   0.32     11     3   0.23     33    11   1.33     11     3   0.31
 85  3  3     10     3   0.17     10     3   0.17     10     3   0.17     41     4   1.38     31     7   0.97
 86  3  3      8     1   0.10      8     1   0.10      8     1   0.09      8     1   0.10      8     1   0.10
 87  4  3      8     1   0.14      8     1   0.14      8     1   0.15      8     1   0.14      8     1   0.15
 88  4  3     36     5   1.51     19     3   0.61     36     5   1.50     49     3   1.75     19     3   0.61
 89  3  3      8     1   0.09      8     1   0.09      8     1   0.10      8     1   0.10      8     1   0.09
 90  4  3     33     4   1.33     37     9   5.10     56    14  13.64     73     7   3.39     51    10   9.30
 91  3  3     11     3   0.31     11     3   0.31     13     4   0.38     11     3   0.19     18     3   0.37
 92  4  4     11     4   0.43     11     4   0.43     11     4   0.42     26     4   1.05     50    11   3.84
 93  4  3     18     3   0.55     18     3   0.54     27     4   0.96     35     3   0.74     27     4   0.95
 94  4  3      8     1   0.14      8     1   0.14      8     1   0.15      8     1   0.14      8     1   0.14
 95  4  4     10     3   0.33     10     3   0.33     10     3   0.33     10     3   0.31     30     3   1.30
 96  3  3     10     3   0.15     10     3   0.16     10     3   0.16     19     3   0.37     10     2   0.15
 97  3  3      9     3   0.14      9     3   0.14      9     3   0.14     35     3   0.63      9     3   0.14
 98  3  3     10     3   0.17     10     3   0.17     10     3   0.16     34     3   0.87     10     3   0.16
 99  3  3     10     3   0.17     10     3   0.17     10     3   0.17     17     3   0.35     22     6   0.58
100  4  4     10     3    0J3     10     3   0.34     10     3   0.33     25     3   0.91     17     3   0.54
Total       1356   359  43.63   1363   397  50.04   1333   373  53.76   3433   371  84.96   1983   346  82.91
SP           100   100    100    100   100    100    100   100    100    100   100    100    100   100    100
Average    13.56  3.59   0.43  13.63  3.97   0.50  13.33  3.73   0.53  34.33  3.71   0.85  19.83  3.46   0.83


Table A.16: Performance of fine-grained multi-method planners over 100 testing
problems in the blocks-world (continued).
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Appendix  B 


Experimental  Results:  The  Machine-Shop 
Scheduling  Domain 


This appendix gives the detailed numeric information from the experiments in
the machine-shop scheduling domain. Appendix B.1 presents the experimental
results for the six single-method planners over 30 training problems. Appendix B.2
presents the experimental results for the six single-method planners and the created
multi-method planners over 100 test problems.
B.1  Performance  over  30  Training  Problems 


Table B.1: Performance of M1 over 30 training problems in the machine-shop
scheduling domain.
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MetluMlMa 


\KSS2l 


D 

Km 

3 

5 

34 

34 

36 

34.7 

3 

5 

33 

33 

33 

33.0 

4 

5 

17 

33 

17 

19.0 

5

5 

36 

34 

36 

35.3 

6 

5 

33 

33 

33 

33.0 

7 

5 

. 

• 

. 

• 

8 

5 

39 

33 

39 

36.7 

9 

5 

• 

- 

. 

10 

5 

16 

16 

16 

16.0 

11 

5 

. 

• 

• 

. 

13 

5 

33 

33 

17 

31.0 

13 

5 

16 

16 

16 

16.0 

14 

5 

35 

35 

17 

33.3 

15 

5 

34 

34 

34 

34.0 

16 

5 

35 

35 

17 

33.3 

17 

5 

. 

- 

- 

- 

18 

5 

• 

• 

• 

• 

19 

5 

. 

. 

. 

30 

5 

34 

34 

40 

39.3 

31 

5 

8 

8 

8 

8.0 

33 

5 

- 

• 

- 

- 

33 

5 

35 

35 

33 

34.3 

34 

5 

40 

40 

34 

34.7 

35 

5 

33 

33 

34 

33.7 

36 

5 

16 

16 

16 

16.0 

37 

5 

8 

8 

8 

8.0 

38 

5 

. 

• 

• 

- 

39 

5 

16 

16 

16 

16.0 

30 

5 

34 

34 

34 

34.0 

490 

33 

493 

33 

478 

33 

487.0 

33.0 

Table B.2: Performance of M2 over 30 training problems in the machine-shop
scheduling domain.
Table B.3: Performance of M3 over 30 training problems in the machine-shop
scheduling domain.
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ethoaM« 


IliCSi 


BsiSTi: 


1 

5 

40 

32 

57 

43.0 

2 

5 

39 

33 

33 

34.3 

3 

5 

39 

48 

48 

45.0 

4 

5 

39 

40 

39 

39.3 

5 

5 

39 

33 

33 

34J 

6 

5 

33 

33 

33 

39.0 

1 

5 

72 

55 

56 

61.0 

8 

5 

48 

41 

57 

48.7 

9 

5 

48 

40 

57 

48.3 

10 

5 

16 

23 

33 

30.7 

11 

5 

16 

35 

35 

33i> 

13 

5 

39 

39 

40 

393 

13 

5 

24 

33 

34 

36.7 

14 

5 

39 

63 

56 

52.7 

15 

5 

33 

24 

32 

393 

16 

5 

48 

40 

65 

51.0 

17

5 

34 

34 

40 

393 

18 

5 

42 

40 

41 

41.0 

19 

5 

40 

40 

40 

40.0 

30 

5 

41 

57 

41 

463 

31 

5 

34 

34 

34 

34.0 

32 

5 

40 

55 

39 

44.7 

33 

5 

40 

40 

40 

40.0 

34 

5 

73 

41 

41 

51.7 

35 

5 

41 

33 

41 

38.0 

36 

5 

33 

34 

16 

31.0 

3T 

5 

16 

15 

IS 

153 

38 

5 

48 

56 

73 

59.0 

39 

5 

16 

23 

16 

183 

30 

5 

34 

34 

40 

39.3 

Total 

Solved Problems

1093 

30 

1093 

30 

1183 

30 

1133.7 

30.0 

KQ 

4 

4 

4 

4.0 

4 

6 

6 

S3 

4 

5 

4 

4.3 

4 

4 

4 

4.0 

2 

4 

4 

33 

9 

6 

7 

7.3 

6 

6 

8 

6.7 

6 

5 

8 

63 

3 

3 

2 

2.0 

3 

4 

4 

3.3 

4 

4 

5 

4.3 

3 

4 

3 

3.3 

4 

7 

7 

6.0 

4 

3 

4 

3.7 

6 

5 

9 

6.7 

3 

3 

5 

3.7 

7 

5 

6 

6.0 

5 

5 

5 

5.0 

6 

8 

6 

6.7 

3 

3 

3 

3.0 

5 

6 

4 

5.0 

5 

5 

5 

5.0 

10 

6 

6 

7.3 

6 

4 

6 

S3 

3 

3 

3 

3.3 

2 

1 

1 

1.3 

Table B.4: Performance of M4 over 30 training problems in the machine-shop
scheduling domain.
Table B.5: Performance of M5 over 30 training problems in the machine-shop
scheduling domain.
Table B.6: Performance of M6 over 30 training problems in the machine-shop
scheduling domain.


B.2  Performance  over  100  Testing  Problems 

The  entries  in  the  tables  are  defined  as  follows: 


N:  Problem number
B:  Number of blocks
G:  Number of goal conjuncts
D:  Number of decisions
L:  Plan length
C:  CPU time (sec.)
SP: Solved problems
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D  L  C  I 
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3 

0.67 
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3 

0.67 
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3 

0.68 

36 

3 

0.53 

36 

3 

0.53 

36 

3 

0.54 
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3 
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17 

3 

0.33 

17 

3 

0.33 

17 

3 
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17 

3 
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17 

3 
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36 

3 
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36 

3 
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36 

3 
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16 

3 
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16 

3 
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16 

3 
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33 

4 
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33 

4 
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33 

4 
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16 

2 
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16 

3 
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16 

3 
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3 
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35 

3 
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35 

3 
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16 

3 

0.31 

16 

3 
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16 

3 
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17 

3 

0.33 

17 

3 
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17 

3 
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34 

3 
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34 

3 
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34 

3 
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17 

3 

0.33 

17 

3 
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17 

3 
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34 

3 
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34 

3 
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34 

3 
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8 

1 
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8 

1 
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8 

1 
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25 

3 
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35 

3 
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25 

3 
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30 

3 

0.77 

30 

3 
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30 

3 
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34 

4 
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34 

4 
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4 
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3 
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3 
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3 
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1 
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1 
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8 

1 
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3 
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3 
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34 

3 

0.51 

34 

3 
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8 

1 

0.16 

8 

1 

0.16 

8 

1 
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36 

3 

0.54 

36 

3 
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36 

3 
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24 

3 
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34 

3 

0.50 

34 

3 

0.52 

24 

3 

0.53 

34 

3 

0.53 

34 

3 
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16 

3 

0.30 

16 

3 
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16 

2 

0.31 

16 

3 

0.31 

16 

3 
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3 

0.31 
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3 
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16 

3 
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16 

3 
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33 

4 
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33 

4 
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33 

4 
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40 

5 

1.35 

41 

6 

1.05 

41 

6 

1.05 

41 

6 

1.06 

71 

8 

3.33 

71 

8 

3.33 

71 

8 

3.34 

50 

8 

1.73 

50 

8 

1.74 

50

8 

1.74 

41 

6 

1.07 

41 

6 

1.07 

41 

6 

1.07 

43 

7 

1.53 

43 

7 

1.54 

43 

7 

1.55 

40 

5 

0.99 

40 

5 

oao 

40 

5 

0J8 

33 

4 

0.75 

33 

4 

0.76 

33 

4 

0.75 

57 

8 

1.70 

57 

8 

1.67 

57 

8 

1.67 

33 

3 

0.48 

33 

3 

0.49 

33 

3 

0.49 

34 

6 

1.15 

34 

6 

1.15 

34 

6 

1.18 

50 

8 

1.78 

50 

8 

1.78 

50 

8 

1.77 

24 

3 

0.55 

34 

3 

0.55 

34 

3 

0.56 

55 

6 

1.43 

55 

6 

1.43 

55 

6 

1.43 

24 

3 

0.53 

34 

3 

0.53 

34 

3 

0.53 

81 

11 

3.97 

81 

11 

2.96 

81 

11 

2.96 

34 

3 

0.55 

34 

3 

0.56 

34 

3 

0.56 

58 

9 

1.76 

58 

9 

1.76 

58 

9 

1.77 

47 

5 

1.39 

47 

5 

1.27 

47 

5 

1.28 

33 

4 

0.76 

33 

4 

0.75 

33 

4 

0.75 

15 

1 

0.39 

15 

1 

0.30 

15 

1 

0.29 

41 

6 

1.08 

41 

6 

1.08 

41 

6 

1.07 

41 

6 

1.08 

41 

6 

1.08 

41 

6 

1.07 

40 

5 

0.99 

40 

5 

0.99 

40 

5 

0.99 

41 

6 

1.04 

41 

6 

1.06 

41 

6 

1.05 

16 

2 

0.31 

16 

3 

0.31 

16 

3 

0.31 

15 

1 

0.39 

15 

1 

0.30 

15 

1 

0.30 

56 

7 

1.59 

56 

7 

1.59 

56 

7 

1.59 

16 

3 

0.31 

16 

3 

0.31 

16 

3 

0.31 

34 

3 

0.53 

34 

3 

0.51 

34 

3 

0.52 

15 

1 

0.30 

15 

1 

0.39 

15 

1 

0.39 

48 

6 

1.33 

48 

6 

1.24 

48 

6 

1.33 

33 

5 

0.81 

33 

5 

0.81 

33 

5 

0.81 

24 

3 

0.51 

24 

3 

0.50 

24 

3 

0.51 

34 

3 

0.51 

34 

3 

0.51 

34 

3 

0.51 

40 

5 

0.99 

40 

5 

0.99 

40 

5 

0.99 

31 

3 

0.69 

31 

3 

0.69 

31 

3 

0.69 

32 

4 

0.76 

33 

4 

0.77 

32 

4 

0.77 

34 

6 

1.07 

34 

6 

1.07 

34 

6 

1.08 

16 

3 

0.31 

16 

3 

0.31 

16 

3 

0.31 

16 

3 

0.33 

16 

3 

0.31 

16 

3 

0.31 

40 

5 

0.96 

40 

5 

0.96 

40 

5 

0.96 

40 

5 

0.97 

40 

5 

0.95 

40 

5 

0.97 

33 

2 

0.47 

33 

3 

0.47 

33 

2 

0.46 

73 

10 

2.31 

73 

10 

3.30 

73 

10 

2.29 

34 

3 

0.53 

24 

3 

0.53 

34 

3 

0.53 

33 

3 

0.47 

23 

2 

0.47 

23 

3 

0.47 

33 

5 

0.96 

33 

5 

0.97 

33 

5 

0.97 

49 

7 

1.41 

49 

7 

1.43 

49 

7 

1.43 

65 

9 

2.02 

65 

9 

2.02 

65 

9 

2.03 

Table  B.7:  Performance  of  single-method  planners  over  100  testing  problems  in  the 
machine-shop  scheduling  domain. 
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Table  B.8:  Performance  of  single-method  planners  over  100  testing  problems  in  the 
machine-shop  scheduling  domain  (continued). 
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Table  B.9:  Performance  of  multi-method  planners  over  100  testing  problems  in  the 
machine-shop scheduling domain.
Table B.10: Performance of multi-method planners over 100 testing problems in
the machine-shop scheduling domain (continued).
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